Thursday, June 13, 2002 3:34 PM Page 1

GD2403 pLNBIv-G .MPD (1 > 7880) Site and Sequence
Enzymes : 36 of 538 enzymes (Filtered) 

Settings : Circular, Certain Sites Only, Standard Genetic Code 

PstI

GAAACCAGCAGCGGCTATCCGCGCATCCAT GCCCCCGAACTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGC 

2800

CTTTGGTCGTCGCCGATAGGCGCGTAGGTACGGGGGCTTGACGTCCTCACCCCTCCGTGCTACCGGCGAAACCAGCTCCG 
BamHI 

GGATCCTAGCAGAAAAATAAGACTTGATTCCCCCTTAAAATTACAACTGCTAGAAAATGAATGGCTCTCCCGCCTTTTTT 

2880

CCTAGGATCGTCTTTTTATTCTGAACTAAGGGGGAATTTTAATGTTGACGATCTTTTACTTACCGAGAGGGCGGAAAAAA 



•BLV Promoter- 



NarI PvuII

GAGGGGGAATCATTTGTATGAAAGATCATGCCGACCTAGGCGCCGCCACCGCCCCGTAAACCAGACAGAGACGTCAGCTG 

2960

CTCCCCCTTAGTAAACATACTTTCTAGTACGGCTGGATCCGCGGCGGTGGCGGGGCATTTGGTCTGTCTCTGCAGTCGAC 



■BLV Promoter - 



PvuII

CCAGAAAAGCTGGTGACGGCAGCTGGTGGCTAGAATCCCCGTACCTCCCCAACTTCCCCTTTCCCGAAAAATCCACACCC 

3040

GGTCTTTTCGACCACTGCCGTCGACCACCGATCTTAGGGGCATGGAGGGGTTGAAGGGGAAAGGGCTTTTTAGGTGTGGG 



•BLV Promoter- 



NaeI

TGAGCTGCTGACCTCACCTGCTGATAAATTAATAAAATGCCGGCCCTGTCGAGTTAGCGGCACCAGAAGCGTTCTTCTCC 

3120

ACTCGACGACTGGAGTGGACGACTATTTAATTATTTTACGGCCGGGACAGCTCAATCGCCGTGGTCTTCGCAAGAAGAGG 



BLV Promoter - 



XhoI HindIII

TGAGACCCTCGTGCTCAGCTCTCGGTCCTGCCTCGAGAAGCTTGTTATCACAAGTTTGTACAAAAAAGCTGAACGAGAAA 

3200

ACTCTGGGAGCACGAGTCGAGAGCCAGGACGGAGCTCTTCGAACAATAGTGTTCAAACATGTTTTTTCGACTTGCTCTTT 



3. 



Gateway 1 



• BLV Promoter 

= attm- 
SEQUENCE LISTING - TAX [Bovine leukemia virus]



LOCUS AAF97920 309 aa 

ACCESSION AAF97920 

NUCLEOTIDE SEQUENCE (SEQ ID NO:2): 

ATG GCA AGT GTT GTT GGT TGG GGG CCC CAC TCT CTA CAT GCC TGC CCG 
GCC CTG GTT TTG TCC AAT GAC GTC ACC ATC GAT GCC TGG TGC CCC CTC 
TGC GGG CCC CAT GAG CGA CTC CAA TTC GAA AGG ATC GAC ACC ACG CAC 
ACC TGC GAG ACC CAC CGT ATC ACC TGG ACC GCC GAT GGA CGA CCT TTC 
GGC CTC AAT GGA GCG CTG TTC CCT CGA CTG CAT GTC TCC AGA GAC CCG 
GCC CCA AGG GCC CGA CGA CTC TGG ATC AAC TGC CCC CTT CCG GCC GTT 
CGC GCT CAG CCC GGC CCG GTT TCA CTT TCC CCC TTC GAG CGG TCC CCC 
TTC CAG CCC TAC CAA TGC CAA TTG CCC TCG GCC TCT AGC GAC GGT TGC 
CCC GTC ATC GGG CAC GGC CTT CTT CCC TGG AAC AAC TTA GTA ACG CAT 
CCT TGT CCT CGG AAA GTC CTT ATA TTA AAT CAA ATG GCC AAT TTT TCC 
TTA CTC CCC CCC TTC AAT ACC CTC CTT GTG GAC CCC CTC CGG TTG TCC 
GTC TTT GCC CCA GAC ACC AGG GGA GCC ATA CGT TAT CTC TCC ACC CTT 
TTG ACG CTA TGC CCA GCT ACT TGT ATT CTA CCC CTC GGC GA GCC CTT 
CTC TCC TAA TGT CCC CAT ATG TCG CTT TCC CCG GGA CTC CAA TGA ACC 
CCC CCT TTC AGA ATT CGA GCT GCC CCT TAT CCA AAC GCC CGG CCT GTC 
TTG GTC TGT CCC CGC GAT CGA CCT ATT CCT AAC CGG CCC CCC TTC CCC 
ATG CGA CCG GTT ACA CGT ATG GTC CAG TCC TCA GGC CTT ACA GCG CTT 
CCT CCA TGA CCC TAC GCT AAC CTG GTC AGA ATT GGT TGC TAG CAG GAA 
ACT AAG ACT TGA TTC ACC CTT AAA ATT ACA ACT GTT AGA AAA TGA ATG 
GCT CTC CCG CCT TTT TTG 

PROTEIN SEQUENCE (SEQ ID NO:7): 

MASVVGWGPHSLHACPALVLSNDVTIDAWCPLCGPHERLQFERIDTTHTCETHRITW 

TADGRPFGLNGALFPRLHVSRDPAPRARRLWINCPLPAVRAQPGPVSLSPFERSPF 

QPYQCQLPSASSDGCPVIGHGLLPWNNLVTHPCPRKVLILNQMANFSLLPPFNTLLV 

DPLRLSVFAPDTRGAIRYLSTLLTLCPATCILPLGEPFSPNVPICRFPRDSNEPPLSEF 

ELPLIQTPGLSWSVPAIDLFLTGPPSPCDRLHVWSSPQALQRFLHDPTLTW 

SELVASRKLRLDSPLKLQLLENEWLSRLF
SEQUENCE LISTING - HTLV-1 Promoter sequence (SEQ ID NO:4)



1 TGACAATGAC CATGAGCCCC AAATATCCCC CGGGGGCTTA GAGCCTCTCA GTGAAAAACA 

61 TTTCCGTGAA ACAGAAGTCT GAGAAGGTCA GGGCCCAGAA TAAGGCTCTG ACGTCTCCCC 

121 CCGGAGGACA GCTCAGCACC AGCTCAGGCT AGGCCCTGAC GTGTCCCCCT AAAGACAAAT

181 CATAAGCTCA GACCTCCGGG AAGCCACCGG GAACCACCCA TTTCCTCCCC ATGTTTGTCA 

241 AGCCGTCCTC AGGCGTTGAC GACAACCCCT CACCTCAAAA AACTTTTCAT GGCACGCATA 

301 CGGCTCAATA AAATAACAGG AGTCTATAAA AGCGTGGGGA CAGTTCAGGA GGG



FIG. 4 



SEQUENCE LISTING - HTLV-1 Tax Nucleic Acid (SEQ ID NO:3) and Protein sequence (SEQ ID NO:8)

1 ATG GCC CAC TTC CCA GGG TTT GGA CAG AGT CTT CTT TTC GGA TAC 45
1 Met Ala His Phe Pro Gly Phe Gly Gln Ser Leu Leu Phe Gly Tyr 15

46 CCA GTC TAC GTG TTT GGA GAC TGT GTA CAA GGC GAC TGG TGC CCC 90
16 Pro Val Tyr Val Phe Gly Asp Cys Val Gln Gly Asp Trp Cys Pro 30

91 ATC TCT GGG GGA CTA TGT TCG GCC CGC CTA CAT CGT CAC GCC CTA 135
31 Ile Ser Gly Gly Leu Cys Ser Ala Arg Leu His Arg His Ala Leu 45

136 CTG GCC ACC TGT CCA GAG CAT CAG ATC ACC TGG GAC CCC ATT GAT 180
46 Leu Ala Thr Cys Pro Glu His Gln Ile Thr Trp Asp Pro Ile Asp 60

181 GGA CGC GTT ATC GGC TCA GCT CTA CAG TTC CTT ATC CCT CGA CTC 225
61 Gly Arg Val Ile Gly Ser Ala Leu Gln Phe Leu Ile Pro Arg Leu 75

226 CCC TCC TTC CCC ACC CAG AGA ACC TCT AAG ACC CTC AAG GTC CTT 270
76 Pro Ser Phe Pro Thr Gln Arg Thr Ser Lys Thr Leu Lys Val Leu 90

271 ACC CCG CCA ATC ACT CAT ACA ACC CCC AAC ATT CCA CCC TCC TTC 315
91 Thr Pro Pro Ile Thr His Thr Thr Pro Asn Ile Pro Pro Ser Phe 105

316 CTC CAG GCC ATG CGC AAA TAC TCC CCC TTC CGA AAT GGA TAC ATG 360
106 Leu Gln Ala Met Arg Lys Tyr Ser Pro Phe Arg Asn Gly Tyr Met 120

361 GAA CCC ACC CTT GGG CAG CAC CTC CCA ACC CTG TCT TTT CCA GAC 405
121 Glu Pro Thr Leu Gly Gln His Leu Pro Thr Leu Ser Phe Pro Asp 135

406 CCC GGA CTC CGG CCC CAA AAC CTG TAC ACC CTC TGG GGA GGC TCC 450
136 Pro Gly Leu Arg Pro Gln Asn Leu Tyr Thr Leu Trp Gly Gly Ser 150

451 GTT GTC TGC ATG TAC CTC TAC CAG CTT TCC CCC CCC ATC ACC TGG 495
151 Val Val Cys Met Tyr Leu Tyr Gln Leu Ser Pro Pro Ile Thr Trp 165

496 CCC CTC CTG CCC CAC GTG ATT TTT TGC CAC CCC GGC CAG CTC GGG 540
166 Pro Leu Leu Pro His Val Ile Phe Cys His Pro Gly Gln Leu Gly 180

541 GCC TTC CTC ACC AAT GTT CCG TAC AAG CGA ATA GAA GAA CTC CTC 585
181 Ala Phe Leu Thr Asn Val Pro Tyr Lys Arg Ile Glu Glu Leu Leu 195

586 TAT AAA ATT TCC CTT ACC ACA GGG GCC CTA ATA ATT CTA CCC GAA 630
196 Tyr Lys Ile Ser Leu Thr Thr Gly Ala Leu Ile Ile Leu Pro Glu 210

631 GAC TGT TTG CCC ACC ACC CTT TTC CAG CCT GTT AGG GCA CCC GTC 675
211 Asp Cys Leu Pro Thr Thr Leu Phe Gln Pro Val Arg Ala Pro Val 225

676 ACG CTA ACA GCC TGG CAA AAC GGC CTC CTT CCG TTC CAC TCA ACC 720
226 Thr Leu Thr Ala Trp Gln Asn Gly Leu Leu Pro Phe His Ser Thr 240

721 CTC ACC ACT CCA GGC CTT ATT TGG ACA TTT ACC GAT GGC ACG CCT 765
241 Leu Thr Thr Pro Gly Leu Ile Trp Thr Phe Thr Asp Gly Thr Pro 255

766 ATG ATT TCC GGG CCC TGC CCT AAA GAT GGC CAG CCA TCT TTA GTA 810
256 Met Ile Ser Gly Pro Cys Pro Lys Asp Gly Gln Pro Ser Leu Val 270

811 CTA CAG TCC TCC TCC TTT ATA TTT CAC AAA TTT CAA ACC AAG GCC 855
271 Leu Gln Ser Ser Ser Phe Ile Phe His Lys Phe Gln Thr Lys Ala 285

856 TAC CAC CCC TCA TTT CTA CTC TCA CAC GGC CTC ATA CAG TAC TCT 900
286 Tyr His Pro Ser Phe Leu Leu Ser His Gly Leu Ile Gln Tyr Ser 300

901 TCC TTT CAT AAT TTA CAT CTC CTG TTT GAA GAA TAC ACC AAC ATC 945
301 Ser Phe His Asn Leu His Leu Leu Phe Glu Glu Tyr Thr Asn Ile 315

946 CCC ATT TCT CTA CTT TTT AAC GAA AAA GAG GCA GAT GAC AAT GAC 990
316 Pro Ile Ser Leu Leu Phe Asn Glu Lys Glu Ala Asp Asp Asn Asp 330

991 CAT GAG CCC CAA ATA TCC CCC GGG GGC TTA GAG CCT CCC AGT GAA 1035
331 His Glu Pro Gln Ile Ser Pro Gly Gly Leu Glu Pro Pro Ser Glu 345

1036 AAA CAT TTC CGC GAA ACA GAA GTC TGA 1070
346 Lys His Phe Arg Glu Thr Glu Val TRM 354




SEQUENCE LISTING - HIV Promoter sequence (SEQ ID NO:5)



1 CTGGAAGGGC TAATTTGGTC CCAAAGAAGA CAAGAGATCC TTGATCTGTG GATCTACCAC

61 ACACAAGGCT ACTTCCCTGA TTGGCAGAAT TACACACCAG GGCCAGGGAT CAGATATCCA

121 CTGACCTTTG GATGGTGCTT CAAGCTAGTA CCAGTTGAGC CAGAGAAGGT AGAAGAGGCC

181 AATGAAGGAG AGAACAACAG CTTGTTACAC CCTATGAGCC TGCATGGGAT GGAGGACGCG

241 GAGAAAGAAG TGTTAGTGTG GAGGTTTGAC AGCAAACTAG CATTTCATCA CATGGCCCGA

301 GAGCTGCATC CGGAGTACTA CAAAGACTGC TGACATCGAG CTTTCTACAA GGGACTTTCC

361 GCTGGGGACT TTCCAGGGAG GCGTGGCCTG GGCGGGACTG GGGAGTGGCG TCCCTCAGAT

421 GCTGCATATA AGCAGCTGCT TTTTGCCTGT ACTGGG
SEQUENCE LISTING - HIV Tat nucleic acid (SEQ ID NO:6) and amino acid (SEQ ID NO:9) sequences



1 ATG GAG CCA GTA GAT CCT AAT CTA GAG CCC TGG AAG CAT CCA GGA 45
1 Met Glu Pro Val Asp Pro Asn Leu Glu Pro Trp Lys His Pro Gly 15

46 AGT CAG CCT AGG ACT GCT TGT AAC AAT TGC TAT TGT AAA AAG TGT 90
16 Ser Gln Pro Arg Thr Ala Cys Asn Asn Cys Tyr Cys Lys Lys Cys 30

91 TGC TTT CAT TGC TAC GCG TGT TTC ACA AGA AAA GGC TTA GGC ATC 135
31 Cys Phe His Cys Tyr Ala Cys Phe Thr Arg Lys Gly Leu Gly Ile 45

136 TCC TAT GGC AGG AAG AAG CGG AGA CAG CGA CGA AGA GCT CCT CAG 180
46 Ser Tyr Gly Arg Lys Lys Arg Arg Gln Arg Arg Arg Ala Pro Gln 60

181 GAC AGT CAG ACT CAT CAA GCT TCT CTA TCA AAG CAA CCC GCC TCC 225
61 Asp Ser Gln Thr His Gln Ala Ser Leu Ser Lys Gln Pro Ala Ser 75

226 CAG TCC CGA GGG GAC CCG ACA GGC CCG ACG GAA TCG AAG AAG AAG 270
76 Gln Ser Arg Gly Asp Pro Thr Gly Pro Thr Glu Ser Lys Lys Lys 90

271 GTG GAG AGA GAG ACA GAG ACA GAT CCG TTC GAT TAG 306
91 Val Glu Arg Glu Thr Glu Thr Asp Pro Phe Asp TRM 102
FIG. 10

Friday, November 15, 2002 12:30 PM Page 1 

pLBC-BTaxW Map.MPD (1 > 7685) Site and Sequence 
Enzymes : 35 of 538 enzymes (Filtered) 

Settings : Circular, Certain Sites Only, Standard Genetic Code

GAATTAATTCATACCAGATCACCGAAAACTGTCCTCCAAATGTGTCCCCCTCACACTCCCAAATTCGCGGGCTTCTGCCT 

80

CTTAATTAAGTATGGTCTAGTGGCTTTTGACAGGAGGTTTACACAGGGGGAGTGTGAGGGTTTAAGCGCCCGAAGACGGA 

SacII

CTTAGACC ACTCT ACCCTATTCCCC AC ACTC ACCGGAGCC AAAGCCGCGGCCC TTCCGTTTCTTTGCTTTTGAAAGACCC 

160

GAATCTGGTGAGATGGGATAAGGGGTGTGAGTGGCCTCGGTTTCGGCGCCGGGAAGGCAAAGAAACGAAAACTTTCTGGG 



5' LTR 



5' LTR (MoMSV)

NheI

C ACCCGT AGGTGGCAAGCT AGCTT AAGT AACGCC ACTTTGC AAGGC ATGGAAAAATAC ATAACTGAGAAT AGAAAAGTTC 

240

GTGGGC ATCC ACCGTTCGATCGAATTC ATTGCGGTGAAACGTTCCGTACCTTTTTATGT ATTGACTCTTATC TTTTCAAG 

5' LTR



5' LTR (MoMSV) 

EcoRV

AGATCAAGGTCAGGAACAAAGAAACAGCTGAATACCAAACAGGATATCTGTGGTAAGCGGTTCCTGCCCCGGCTCAGGGC 

320

TCTAGTTCCAGTCCTTGTTTCTTTGTCGACTTATGGTTTGTCCTATAGACACCATTCGCCAAGGACGGGGCCGAGTCCCG 

5' LTR 

5' LTR (MoMSV) 

EcoRV 

CAAGAACAGATGAGACAGCTGAGTGATGGGCCAAACAGGATATCTGTGGTAAGCAGTTCCTGCCCCGGCTCGGGGCCAAG 

400

GTTCTTGTCTACTCTGTCGACTCACT ACCCGGTTTGTCCTATAGAC ACCATTCGTC AAGGACGGGGCCGAGCCCCGGTTC 

5' LTR 



5' LTR (MoMSV) 

AACAGATGGTCCCCAGATGCGGTCCAGCCCTCAGCAGTTTCTAGTGAATCATCAGATGTTTCC AGGGTGCCCCAAGGACC 

480

ttgtctacc aggggtctacgccaggtcgggagtcgtcaaag atc acttagt agtctacaaaggtcccacggggttcctgg 

5' LTR



5' LTR (MoMSV) 

TGAAAATGACCCTGTACCTTATTTGAACTAACC AATC AGTTCGCTTCTCGCTTCTGTTCGCGCGCTTCCGCTCTCCGAGC 

560

ACTTTTACTGGGAC ATGGAATAAACTTGATTGGTT AGTCAAGCGAAGAGCGAAGACAAGCGCGCGAAGGCGAGAGGCTCG 

5' LTR 



•5' LTR (MoMSV)- 
FIG. 10 (cont)

SacI AscI SmaI KpnI

TCAATAAAAGAGCCCACAACCCCTCACTCGGCGCGCCAGTCTTCCGATAGACTGCGTCGCCCGGGTACCCGTATTCCCAA 

640

AGTTATTTTCTCGGGTGTTGGGGAGTGAGCCGCGCGGTC AGAAGGCTATCTGACGCAGCGGGCCC ATGGGC ATAAGGGTT 

5' LTR



•5' LTR (MoMSV) - 



TAAAGCCTCTTGCTGTTTGCATCCGAATCGTGGTCTCGCTGTTCCTTGGGAGGGTCTCCTCTGAGTGATTGACTACCCAC 
720

ATTTCGGAGAACGACAAACGTAGGCTTAGCACCAGAGCGACAAGGAACCCTCCCAGAGGAGACTCACTAACTGATGGGTG 

5' LTR 

5' LTR (MoMSV) 



GACGGGGGTCTTTCATTTGGGGGCTCGTCCGGGATTTGGAGACCCCTGCCCAGGGACCACCGACCCACCACCGGGAGGTA 
800

CTGCCCCCAGAAAGTAAACCCCCGAGCAGGCCCTAAACCTCTGGGGACGGGTCCCTGGTGGCTGGGTGGTGGCCCTCCAT 

5' LTR
5' LTR (MoMSV)



SpeI

AGCTGGCCAGCAACTTATCTGTGTCTGTCCGATTGTCTAGTGTCTATGTTTGATGTTATGCGCCTGCGTCTGTACTAGTT 
880

TCGACCGGTCGTTGAATAGACACAGACAGGCTAACAGATCACAGATACAAACTACAATACGCGGACGCAGAC ATGATCAA 
Pkg Rgn

Extended Packaging Region



AGCTAACTAGCTCTGTATCTGGCGGACCCGTGGTGGAACTGACGAGTTCTGAAC ACCCGGCCGCAACCCTGGGAGACGTC 

960

TCGATTGATCGAGACATAGACCGCCTGGGCACCACCTTGACTGCTCAAGACTTGTGGGCCGGCGTTGGGACCCTCTGCAG 

Pkg Rgn

Extended Packaging Region 



CCAGGGACTTTGGGGGCCGTTTTTGTGGCCCGACCTGAGGAAGGGAGTCGATGTGGAATCCGACCCCGTCAGGATATGTG 

1040

GGTCCCTGAAACCCCCGGC AAAAACACCGGGCTGGACTCCTTCCCTCAGCTACACCTTAGGCTGGGGCAGTC CTATACAC 
Pkg Rgn

Extended Packaging Region 



GTTCTGGTAGGAGACGAGAACCTAAAACAGTTCCCGCCTCCGTCTGAATTTTTGCTTTCGGTTTGGAACCGAAGCCGCGC 
1120

CAAGACCATCCTCTGCTCTTGGATTTTGTCAAGGGCGGAGGCAGACTTAAAAACGAAAGCCAAACCTTGGCTTCGGCGCG 

Pkg Rgn

Extended Packaging Region 



PstI PstI

GTCTTGTCTGCTGCAGCGCTGC AGCATCGTTCTGTGTTGTCTCTGTCTGACTGTGTTTCTGTATTTGTCTGA AAATTAGG 

1200

CAGAACAGACGACGTCGCGACGTCGTAGCAAGACACAACAGAGACAGACTGACACAAAGACATAAACAGACTTTTAATCC 

Pkg Rgn

Extended Packaging Region 
FIG. 10 (cont)

GCCAGACTGTTACCACTCCCTTAAGTTTGACCTTAGGTCACTGGAAAGATGTCGAGCGGATCGCTCACAACCAGTCGGTA 
1280

CGGTCTGACAATGGTGAGGGAATTCAAACTGGAATCCAGTGACCTTTCTACAGCTCGCCTAGCGAGTGTTGGTCAGCCAT 
Pkg Rgn



• Extended Packaging Region • 



PstI



GATGTCAAGAAGAGACGTTGGGTTACCTTCTGCTCTGC AGAATGGCCAACCTTTAACGTCGGATGGCCGCGAGACGGCAC 

1360

CTACAGTTCTTCTCTGCAACCCAATGGAAGACGAGACGTCTTACCGGTTGGAAATTGCAGCCTACCGGCGCTCTGCCGTG 

Pkg Rgn



■ Extended Packaging Region 



CTTTAACCGAGACCTCATCACCCAGGTTAAGATCAAGGTCTTTTCACCTGGCCCGCATGGACACCCAGACCAGGTCCCCT 

1440

GAAATTGGCTCTGGAGTAGTGGGTCCAATTCTAGTTCCAGAAAAGTGGACCGGGCGTACCTGTGGGTCTGGTCCAGGGGA 

Pkg Rgn



■ Extended Packaging Region ■ 



ACATCGTGACCTGGGAAGCCTTGGCTTTTGACCCCCCTCCCTGGGTCAAGCCCTTTGTACACCCTAAGCCTCCGCCTCCT 

1520

TGTAGCACTGGACCCTTCGGAACCGAAAACTGGGGGGAGGGACCCAGTTCGGGAAACATGTGGGATTCGGAGGCGGAGGA 

Pkg Rgn



• Extended Packaging Region • 



CTTCCTCCATCCGCCCCGTCTCTCCCCCTTGAACCTCCTCGTTCGACCCCGCCTCGATCCTCCCTTTATCCAGCCCTCAC 

1600

GAAGGAGGTAGGCGGGGCAGAGAGGGGGAACTTGGAGGAGCAAGCTGGGGCGGAGCTAGGAGGGAAATAGGTCGGGAGTG 
Pkg Rgn



• Extended Packaging Region 



NarI EcoRI BclI BclI

TCCTTCTCTAGGCGCCGGAATTCCGATCTGATCAAGAGACAGGATGAGGGAGCTTGTATATCCATTTTCGGATCTGATCA 
1680

AGGAAGAGATCCGCGGCCTTAAGGCTAGACTAGTTCTCTGTCCTACTCCCTCGAACATATAGGTAAAAGCCTAGACTAGT 



Pkg Rgn

Extended Packaging Region



NcoI



GCACGTGTTGACAATTAATCATCGGCATAGTATATCGGC ATAGTATAATACGACAAGGTGAGGAACTAAACCATGGCC AA 

1760

CGTGCACAACTGTTAATTAGTAGCCGTATCATATAGCCGTATCATATTATGCTGTTCCACTCCTTGATTTGGTACCGGTT 

EM7   Met Ala Lys
EM7 promoter   BLAST



GCCTTTGTCTCAAGAAGAATCC ACCCTCATTGAAAGAGC AACGGCTACAATCAACAGCATCCCCATCTCTGAAGACTACA 

1840

CGGAAACAGAGTTCTTCTTAGGTGGGAGTAACTTTCTCGTTGCCGATGTTAGTTGTCGTAGGGGTAGAGACTTCTGATGT 

Pro Leu Ser Gln Glu Glu Ser Thr Leu Ile Glu Arg Ala Thr Ala Thr Ile Asn Ser Ile Pro Ile Ser Glu Asp Tyr
BLAST

GCGTCGCCAGCGCAGCTCTCTCTAGCGACGGCCGCATCTTCACTGGTGTCAATGTATATCATTTTACTGGGGGACCTTGT 

1920

CGCAGCGGTCGCGTCGAGAGAGATCGCTGCCGGCGTAGAAGTGACC ACAGTTACATATAGTAAAATGACCCCCTGGAACA 

Ser Val Ala Ser Ala Ala Leu Ser Ser Asp Gly Arg Ile Phe Thr Gly Val Asn Val Tyr His Phe Thr Gly Gly Pro Cys
BLAST 



NruI PvuI

GCAGAACTCGTGGTGCTGGGCACTGCTGCTGCTGCGGCAGCTGGCAACCTGACTTGTATCGTCGCGATCGGAAATGAGAA 

2000

CGTCTTGAGCACCACGACCCGTGACGACGACGACGCCGTCGACCGTTGGACTGAACATAGCAGCGCTAGCCTTTACTCTT 

Ala Glu Leu Val Val Leu Gly Thr Ala Ala Ala Ala Ala Ala Gly Asn Leu Thr Cys Ile Val Ala Ile Gly Asn Glu Asn
BLAST 



SalI



CAGGGGCATCTTGAGCCCCTGCGGACGGTGTCGACAGGTGCTTCTCGATCTGCATCCTGGGATCAAAGCGATAGTGAAGG 

2080

GTCCCCGTAGAACTCGGGGACGCCTGCC ACAGCTGTCCACGAAGAGCTAGACGTAGGACCCTAGTTTCGCTATCACTTCC 

Arg Gly Ile Leu Ser Pro Cys Gly Arg Cys Arg Gln Val Leu Leu Asp Leu His Pro Gly Ile Lys Ala Ile Val Lys
BLAST 



ACAGTGATGGACAGCCGACGGC AGTTGGGATTCGTGAATTGCTGCCCTCTGGTTATGTGTGGGAGGGCTAAGCACTTCGT 
2160

TGTC ACTACCTGTCGGCTGCCGTCAACCCTAAGCACTTAACGACGGGAGACCAATACAC ACCCTCCCGATTCGTGAAGCA 

Asp Ser Asp Gly Gln Pro Thr Ala Val Gly Ile Arg Glu Leu Leu Pro Ser Gly Tyr Val Trp Glu Gly •
BLAST

GGCCGAGGAGCAGGACTGACACGTGCTACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCG 
2240

CCGGCTCCTCGTCCTGACTGTGCACGATGCTCTAAAGCTAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGC 



TTTTCCGGGACGCCGATCCGGCCATTAGCCATATTATTCATTGGTTATATAGCATAAATCAATATTGGCTATTGGCCATT 

2320

AAAAGGCCCTGCGGCTAGGCCGGTAATCGGTATAATAAGTAACCAATATATCGTATTTAGTTATAACCGATAACCGGTAA 

CMV Pro 

hCMV Promoter 



GC ATACGTTGTATCC ATATC AT AATATGTACATTTATATTGGCTCATGTCCAACATTACCGCCATGTTGAC ATTGATTAT 

2400

CGTATGCAACATAGGTATAGTATTATACATGTAAATATAACCGAGTACAGGTTGTAATGGCGGTACAACTGT AACTAATA 

CMV Pro 

hCMV Promoter 



SpeI

TGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACG 

2480

ACTGATC AATAATTATC ATTAGTTAATGCCCCAGTAATC AAGTATCGGGTATATACCTCAAGGCGCAATGTATTGAATGC 

CMV Pro 



hCMV Promoter 

GTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCC 

2560

CATTTACCGGGCGGACCGACTGGCGGGTTGCTGGGGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCATTGCGG 

CMV Pro 



• hCMV Promoter • 



NdeI

AATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGC AGTAC ATCAAGTGTATCATA 
2640

TTATCCCTGAAAGGTAACTGCAGTTACCC ACCTCATAAATGCCATTTGACGGGTGAACCGTC ATGTAGTTCAC ATAGTAT 

CMV Pro 

hCMV Promoter 



TGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCC AGTAC ATGACCTTATGGGAC 
2720

ACGGTTCATGCGGGGGATAACTGC AGTTACTGCC ATTTACCGGGCGGACCGTAATACGGGTCATGTACTGGAATACCCTG 

CMV Pro 

hCMV Promoter 



NcoI

TTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCG 

2800

AAAGGATGAACCGTCATGTAGATGCATAATCAGTAGCGATAATGGTACCACTACGCCAAAACCGTCATGTAGTTACCCGC 

CMV Pro 

hCMV Promoter 



TGGATAGCGGTTTGACTC ACGGGGATTTCCAAGTCTCCACCCC ATTGACGTC AATGGGAGTTTGTTTTGGC ACC AAAATC 

2880

ACCTATCGCCAAACTGAGTGCCCCTAAAGGTTC AGAGGTGGGGTAACTGC AGTTACCCTC AAACAAAACCGTGGTTTTAG 

CMV Pro 

hCMV Promoter 



AACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCATGTACGGTGGGAGGTCTAT 

2960

TTGCCCTGAAAGGTTTTACAGC ATTGTTGAGGCGGGGTAACTGCGTTTACCCGCCATCCGTACATGCCACCCTCCAGATA 

CMV Pro 

hCMV Promoter 



SacI

ATAAGC AGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCG 

3040

TATTCGTCTCGAGCAAATCACTTGGCAGTCTAGCGGACCTCTGCGGTAGGTGCGACAAAACTGGAGGTATCTTCTGTGGC 

CMV Pro 



hCMV Promoter 
SacII HindIII NcoI

GGACCGATCCAGCCTCCGCGGCCCCAAGCTTGTTATCACAAGTTTGTACAAAAAAGCAGGCTCCCGCCGCCACCATGGCA 

3120

CCTGGCTAGGTCGGAGGCGCCGGGGTTCGAACAATAGTGTTC AAACATGTTTTTTCGTCCGAGGGCGGCGGTGGTACCGT 



CMV Pro 



• hCMV Promoter ' 



attB1   Met Ala

attB1   BTax



ApaI ClaI

AGTGTTGTTGGTTGGGGGCCCCACTCTCTACATGCCTGCCCGGCCCTGGTTTTGTCCAATGATGTCACCATCGATGCCTG 
3200

TCACAACAACCAACCCCCGGGGTGAGAGATGTACGGACGGGCCGGGACCAAAACAGGTTACTACAGTGGTAGCTACGGAC 

Ser Val Val Gly Trp Gly Pro His Ser Leu His Ala Cys Pro Ala Leu Val Leu Ser Asn Asp Val Thr Ile Asp Ala Trp
BTax 

ApaI

GTGCCCCCTCTGCGGGCCCCATGAGCGACTCCAATTCGAAAGGATCGACACCACGCTCACCTGCGAGACCC ACCGTATC A 

3280

CACGGGGGAGACGCCCGGGGTACTCGCTGAGGTTAAGCTTTCCTAGCTGTGGTGCGAGTGGACGCTCTGGGTGGCATAGT 

Cys Pro Leu Cys Gly Pro His Glu Arg Leu Gln Phe Glu Arg Ile Asp Thr Thr Leu Thr Cys Glu Thr His Arg Ile
BTax 



ACTGGACCGCCGATGGACGACCTTGCGGCCTCAATGGAACGTTGTTCCCTCGACTGCATGTCTCCGAGACCCGCCCCCAA 
3360

TGACCTGGCGGCTACCTGCTGGAACGCCGGAGTTACCTTGCAACAAGGGAGCTGACGTACAGAGGCTCTGGGCGGGGGTT 

Asn Trp Thr Ala Asp Gly Arg Pro Cys Gly Leu Asn Gly Thr Leu Phe Pro Arg Leu His Val Ser Glu Thr Arg Pro Gln
BTax 



ApaI

GGGCCCCGACGACTCTGGATCAACTGCCCCCTTCCGGCCGTTCGCGCTCAGCCCGGCCCGGTTTCACTTTCCCCCTTCGA 
3440

CCCGGGGCTGCTGAGACCTAGT TGACGGGGGAAGGCCGGCAAGCGCGAGTCGGGCCGGGCCAAAGTGAAAGGGGGAAGCT 

Gly Pro Arg Arg Leu Trp Ile Asn Cys Pro Leu Pro Ala Val Arg Ala Gln Pro Gly Pro Val Ser Leu Ser Pro Phe Glu
BTax

GCGGTCCCCCTTCC AGCCCTAC CAATGCCAATTGCCCTCGGCCTCTAGCGACGGTTGCCCCATTATCGGGCACGGCCTTC 

3520

CGCCAGGGGGAAGGTCGGGATGGTTACGGTTAACGGGAGCCGGAGATCGCTGCCAACGGGGTAATAGCCCGTGCCGGAAG 

Arg Ser Pro Phe Gln Pro Tyr Gln Cys Gln Leu Pro Ser Ala Ser Ser Asp Gly Cys Pro Ile Ile Gly His Gly Leu
BTax 



TTCCCTGGAACAACTTAGTAACGCATCCTGTCCTCAGAAAAGTCCTTATATTAAATCAAATGGCCAATTTTTCCTTACTC 

3600

AAGGGACCTTGTTGAATCATTGCGTAGGACAGGAGTCTTTTCAGGAATATAATTTAGTTTACCGGTTAAAAAGGAATGAG 

Leu Pro Trp Asn Asn Leu Val Thr His Pro Val Leu Arg Lys Val Leu Ile Leu Asn Gln Met Ala Asn Phe Ser Leu Leu
BTax 



CCCTCCTTCGATACCCTCCTTGTGGACCCCCTCCGGCTGTCCGTCTTTGCCCCAGACACCAGGGGAGCC ATACGTTATCT 

3680

GGGAGGAAGCTATGGGAGGAAC ACCTGGGGGAGGCCGACAGGCAGAAACGGGGTCTGTGGTCCCCTCGGTATGCAATAGA 



Pro Ser Phe Asp Thr Leu Leu Val Asp Pro Leu Arg Leu Ser Val Phe Ala Pro Asp Thr Arg Gly Ala Ile Arg Tyr Leu
BTax 
NdeI



CTCCACCCTTTTGACGCTATGCCCGGCTACTTGTATTCTACCCCTAGGCGAGCCCTTCTCTCCTAATGTCCCCATATGCC 

3760

GAGGTGGGAAAACTGCGATACGGGCCGATGAACATAAGATGGGGATCCGCTCGGGAAGAGAGGATTACAGGGGTATACGG 

Ser Thr Leu Leu Thr Leu Cys Pro Ala Thr Cys Ile Leu Pro Leu Gly Glu Pro Phe Ser Pro Asn Val Pro Ile Cys
BTax 



SmaI EcoRI



GCTTTCCCCGGGACTCCAATGAACCCCCCCTTTCAGAATTCGAGCTGCCCCTTATCCAAACGCCCGGCCTGTCTTGGTCT 
3840

CGAAAGGGGCCCTGAGGTTACTTGGGGGGGAAAGTCTTAAGCTCGACGGGGAATAGGTTTGCGGGCCGGACAGAACCAGA 

Arg Phe Pro Arg Asp Ser Asn Glu Pro Pro Leu Ser Glu Phe Glu Leu Pro Leu Ile Gln Thr Pro Gly Leu Ser Trp Ser
BTax 



PvuI

GTCCCCGCGATCGACCTATTCCTAACCGGTCCCCCTTCCCC ATGCGACCGGTTACACGTATGGTCC AGTCCTC AGGCCTT 
3920

CAGGGGCGCTAGCTGGATAAGGATTGGCCAGGGGGAAGGGGTACGCTGGCCAATGTGCATACCAGGTCAGGAGTCCGGAA 

Val Pro Ala Ile Asp Leu Phe Leu Thr Gly Pro Pro Ser Pro Cys Asp Arg Leu His Val Trp Ser Ser Pro Gln Ala Leu
BTax 



BspHI NheI

ACAGCGCTTCCTTCATGACCCT ACGCTAACCTGGTCCGAATTAGTTGCTAGCAGAAAAATAAGACTTGATTCCCCCTTAA 

4000

TGTCGCGAAGGAAGTACTGGGATGCGATTGGACCAGGCTTAATCAACGATCGTCTTTTTATTCTGAACTAAGGGGGAATT 

Gln Arg Phe Leu His Asp Pro Thr Leu Thr Trp Ser Glu Leu Val Ala Ser Arg Lys Ile Arg Leu Asp Ser Pro Leu
BTax 



ClaI



AATTACAACTGCTAGAAAATGAATGGCTCTCCCGCCTTTTTTGAGACCCAGCTTTCTTGTACAAAGTGGTGATAACATCG 

4080

TTAATGTTGACGATCTTTTACTTACCGAGAGGGCGGAAAAAACTCTGGGTCGAAAGAACATGTTTCACCACTATTGTAGC 



Lys Leu Gln Leu Leu Glu Asn Glu Trp Leu Ser Arg Leu Phe
BTax 



-attB2- 



ATAATCAACCTCTGGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTCCTTTTACGCTATGTGGA 
4160

TATTAGTTGGAGACCTAATGTTTTAAACACTTTCTAACTGACCATAAGAATTGATACAACGAGGAAAATGCGATACACCT 

WPRE

WPRE

TACGCTGCTTTAATGCCTTTGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTT 

4240

ATGCGACGAAATTACGGAAAC ATAGTACGATAACGAAGGGC ATACCGAAAGTAAAAGAGGAGGAACATATTTAGGACCAA 
WPRE
GCTGTCTCTTTATGAGGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTGGTGTGCACTGTGTTTGCTGACGCAACCCCCA

4320

CGAC AGAGAAATACTCCTC AAC ACCGGGCAACAGTCCGTTGCACCGCACCACACGTGACACAAACGACTGCGTTGGGGGT 

WPRE
WPRE 



CTGGTTGGGGCATTGCCACCACCTGTCAGCTCCTTTCCGGGACTTTCGCTTTCCCCCTCCCTATTGCCACGGCGGAACTC 

4400

GACCAACCCCGTAACGGTGGTGGACAGTCGAGGAAAGGCCCTGAAAGCGAAAGGGGGAGGGATAACGGTGCCGCCTTGAG 

WPRE
WPRE 

ATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGACAATTCCGTGGTGTTGTCGGGGAAATC 
4480

TAGCGGCGGACGGAACGGGCGACGACCTGTCCCCGAGCCGACAACCCGTGACTGTTAAGGCACCACAACAGCCCCTTTAG 
WPRE

WPRE 



ATCGTCCTTTCCTTGGCTGCTCGCCTGTGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTGCTACGTCCCTTCGGCCC 

4560

TAGC AGGAAAGGAACCGACGAGCGGACACAACGGTGGACCTAAGACGCGCCCTGCAGGAAGACGATGCAGGGAAGCCGGG 

WPRE
WPRE 



SacII NaeI

TCAATCCAGCGGACCTTCCTTCCCGCGGCCTGCTGCCGGCTCTGCGGCCTCTTCCGCGTCTTCGCCTTCGCCCTCAGACG 

4640

AGTTAGGTCGCCTGGAAGGAAGGGCGCCGGACGACGGCCGAGACGCCGGAGAAGGCGCAGAAGCGGAAGCGGGAGTCTGC 

WPRE
WPRE 



ClaI

AGTCGGATCTCCCTTTGGGCCGCCTCCCCGCCTGATCGATAAAATAAAAGATTTTATTTAGTCTCCAGAAAAAGGGGGGA 
4720

TCAGCCTAGAGGGAAACCCGGCGGAGGGGCGGACTAGCTATTTTATTTTCTAAAATAAATCAGAGGTCTTTTTCCCCCCT 

WPRE

WPRE



NheI

ATGAAAGACCCCACCTGTAGGTTTGGCAAGCTAGCTTAAGTAACGCCATTTTGCAAGGCATGGAAAAATACATAACTGAG 

4800

TACTTTCTGGGGTGGACATCCAAACCGTTCGATCGAATTCATTGCGGTAAAACGTTCCGTACCTTTTTATGTATTGACTC 

3' LTR



3' LTR (MoMLV) 
EcoRV



AATAGAGAAGTTCAGATCAAGGTCAGGAACAGATGGAACAGCTGAATATGGGCCAAACAGGATATCTGTGGTAAGCAGTT 

4880

TTATCTCTTCAAGTCTAGTTCCAGTCCTTGTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCATTCGTCAA 

3' LTR



3' LTR (MoMLV)



EcoRV



CCTGCCCCGGCTCAGGGCCAAGAACAGATGGAACAGCTGAATATGGGCCAAACAGGATATCTGTGGTAAGCAGTTCCTGC 
4960

GGACGGGGCCGAGTCCCGGTTCTTGTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACG 



•3' LTR (MoMLV) ■ 



XbaI



CCCGGCTCAGGGCCAAGAAC AGATGGTCCCCAGATGCGGTCC AGCCCTCAGCAGTTTCTAGAGAACC ATCAG ATGTTTCC 

5040

GGGCCGAGTCCCGGTTCTTGTC TACC AGGGGTCTACGCCAGGTCGGGAGTCGTC AAAGATCTCTTGGTAGTCTAC AAAGG 



3' LTR (MoMLV) 

AGGGTGCCCCAAGGACCTGAAATGACCCTGTGCCTTATTTGAACTAACCAATCAGTTCGCTTCTCGCTTCTGTTCGCGCG 

5120

TCCC ACGGGGTTCCTGGACTTTACTGGGACACGGAATAAACTTGATTGGTTAGTCAAGCGAAGAGCGAAGAC AAGCGCGC 

3' LTR



3' LTR (MoMLV) 

SacI NarI SmaI

CTTCTGCTCCCCGAGCTC AATAAAAGAGCCC ACAACCCCTCACTCGGGGCGCCAGTCCTCCGATTGACTGAGTCGCCCGG 

5200

GAAGACGAGGGGCTCGAGTTATTTTCTCGGGTGTTGGGGAGTGAGCCCCGCGGTCAGGAGGCTAACTGACTC AGCGGGCC 

3' LTR



•3' LTR (MoMLV)- 



KpnI



GTACCCGTGTATCCAATAAACCCTCTTGCAGTTGCATCCGACTTGTGGTCTCGCTGTTCCTTGGGAGGGTCTCCTCTGAG 
5280

CATGGGCACATAGGTTATTTGGGAGAACGTCAACGTAGGCTGAACACCAGAGCGACAAGGAACCCTCCCAGAGGAGACTC 




3' LTR (MoMLV) 

TGATTGACTACCCGTCAGCGGGGGTCTTTCATTTTTCCATTGGGGGCTCGTCCGGGATCGGGAGACCCCTGCCCAGGGAC 
5360

ACTAACTGATGGGCAGTCGCCCCCAGAAAGTAAAAAGGTAACCCCCGAGCAGGCCCTAGCCCTCTGGGGACGGGTCCCTG 

3' LTR
3' LTR (MoMLV)
CACCGACCCACCACCGGGAGGTAAGCTGGCTGCCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCT

5440

GTGGCTGGGTGGTGGCCCTCCATTCGACCGACGGAGCGCGCAAAGCCACTACTGCCACTTTTGGAGACTGTGTACGTCGA 

CCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCG 
5520

GGGCCTCTGCCAGTGTCGAACAGACATTCGCCTACGGCCCTCGTCTGTTCGGGCAGTCCCGCGCAGTCGCCC ACAACCGC 

GGTGTCGGGGCGCAGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTTAACTATGCGGCATCAGAGCAGA 
5600

CCACAGCCCCGCGTCGGTACTGGGTCAGTGCATCGCTATCGCCTCACATATGACCGAATTGATACGCCGTAGTCTCGTCT 

NdeI

TTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCC 

5680

AACATGACTCTCACGTGGTATACGCCACACTTTATGGCGTGTCTACGCATTCCTCTTTTATGGCGTAGTCCGCGAGAAGG 

GCTTCCTCGCTC ACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACG 

5760

CGAAGGAGCGAGTGACTGAGCGACGCGAGCCAGCAAGCCGACGCCGCTCGCCATAGTCGAGTGAGTTTCCGCCATTATGC 

GTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGG 
5840

CAATAGGTGTCTTAGTCCCCTATTGCGTCCTTTCTTGTAC ACTCGTTTTCCGGTCGTTTTCCGGTCCTTGGC ATTTTTCC 

CCGCGTTGCTGGCGTTTTTCC ATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTC AAGTCAGAGGTGGCGA 

5920

GGCGCAACGACCGCAAAAAGGTATCCGAGGCGGGGGGACTGCTCGTAGTGTTTTTAGCTGCGAGTTCAGTCTCCACCGCT 

AACCCGACAGGACTATAAAGATACC AGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACC CTGCCGCT 

6000

TTGGGCTGTCCTGATATTTCTATGGTCCGCAAAGGGGGACCTTCGAGGGAGCACGCGAGAGGAC AAGGCTGGGACGGCGA 

TACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGG 

6080

ATGGCCTATGGACAGGCGGAAAGAGGGAAGCCCTTCGCACCGCGAAAGAGTATCGAGTGCGACATCCATAGAGTCAAGCC 

TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTAT 

6160

ACATCCAGCAAGCGAGGTTCGACCCGACACACGTGCTTGGGGGGCAAGTCGGGCTGGCGACGCGGAATAGGCCATTGATA 

CGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTA 
1 .... I . . . | | .... i 1 i .... t .... | I I 6240 

GCAGAACTCAGGTTGGGCCATTCTGTGCTGAATAGCGGTGACCGTCGTCGGTGACCATTGTCCTAATCGTCTCGCTCCAT 

TGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTC 

■ ■ ■ ■ I ■ ... | ... ■ | | i | | | 6320 

AC ATCCGCCACGATGTCTCAAGAACTTCACCACCGGATTGATGCCGATGTGATCTTCCTGTCATAAACCATAGACGCGAG 



TGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTT 
| | 1 ■ ■ ■ ■ | | | ■ . ■ ■ | | 6400 

ACGACTTCGGTCAATGGAAGCCTTTTTCTCAACCATCGAGAACTAGGCCGTTTGTTTGGTGGCGACCATCGCCACCAAAA 

TTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGC 
I I I ■ ... | ... ■ | | | | 6480 

AAACAAACGTTCGTCGTCTAATGCGCGTCTTTTTTTCCTAGAGTTCTTCTAGGAAACTAGAAAAGATGCCCCAGACTGCG 



BspHI DraI 



TCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTC ATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATT 
I | .... i .... | .... i .... i ... | i , ... i ■■■■ | , ... i ■ ... | 6560 

AGTC ACCTTGCTTTTGAGTGCAATTCCCTAAAACC AGTACTCTAATAGTTTTTCCTAGAAGTGGATCTAGGAAAATTTAA 



DraI 

I 

AAAAATGAAGTTTTAAATC AATCTAAAGTATATATGAGTAAACTTGGTCTGAC AGTTACC AATGCTTAATCAGTGAGGCA 

1 1 I I ■ 1 I ■ ■ ■ ■ 1 I 6640 

TTTTTACTTCAAAATTTAGTTAGATTTCATATATACTC ATTTGAACCAGACTGTCAATGGTTACGAATTAGTCACTCCGT 



• Trp His Lys Ile Leu Ser Ala 
AMP 

CCTATCTC AGCGATCTGTCTATTTCGTTCATCC ATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGG 
■ ■ ■ ■ I | | , ■ , , i ■ ■ ■ ■ | | | ■ ■ ■ ■ | | 6720 

GGATAGAGTCGCTAGACAGATAAAGCAAGTAGGTATCAACGGACTGAGGGGCAGCACATCTATTGATGCTATGCCCTCCC 

Gly Ile Glu Ala Ile Gln Arg Asn Arg Glu Asp Met Thr Ala Gln Ser Gly Thr Thr Tyr Ile Val Val Ile Arg Ser Pro 
AMP 



CTTACC ATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTC ACCGGCTCC AGATTTATCAGCAAT AAACCAGC 

■ ■ ■ ■ * ■ ■ ■ ■ I I | .... i .... | .... i ■ ... | | ■ ■ ■ ■ | i 6800 

GAATGGTAGACCGGGGTCACGACGTTACTATGGCGCTCTGGGTGCGAGTGGCCGAGGTCTAAATAGTCGTTATTTGGTCG 

Lys Gly Asp Pro Gly Leu Ala Ala Ile Ile Gly Arg Ser Gly Arg Glu Gly Ala Gly Ser Lys Asp Ala Ile Phe Trp Gly 
AMP 



CAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGC AACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCT 
I I" 1 ■ ■ ■ ■ 1 ... | .... i .... | .... i .... | | 6880 

GTCGGCCTTCCCGGCTCGCGTCTTCACCAGGACGTTGAAATAGGCGGAGGTAGGTCAGATAATTAACAACGGCCCTTCGA 

Ala Pro Leu Ala Ser Arg Leu Leu Pro Gly Ala Val Lys Asp Ala Glu Met Trp Asp Ile Leu Gln Gln Arg Ser Ala 
AMP 



FspI PstI 



AGAGTAAGTAGTTCGCC AGTTAATAGTTTGCGC AACGTTGTTGCCATTGCTGCAGGCATCGTGGTGTC ACGC TCGTCGTT 

■ ■ ■ ■ i ■ ■ ■ ■ I I ■ ■ ■ ■ | ■ ... i .... | | | | 6960 

TCTCATTCATCAAGCGGTCAATTATCAAACGCGTTGCAACAACGGTAACGACGTCCGTAGCACCACAGTGCGAGCAGCAA 

Leu Thr Leu Leu Glu Gly Thr Leu Leu Lys Arg Leu Thr Thr Ala Met Ala Ala Pro Met Thr Thr Asp Arg Glu Asp Asn 
AMP 



TGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATC AAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTA 

■ ■ ■ ■ i ■ ■ ■ ■ I i .... i .... | | | | ■ ■ ■ ■ | | 7040 

ACCATACCGAAGTAAGTCGAGGCCAAGGGTTGCTAGTTCCGCTCAATGTACTAGGGGGTACAACACGTTTTTTCGCCAAT 

Pro Ile Ala Glu Asn Leu Glu Pro Glu Trp Arg Asp Leu Arg Thr Val His Asp Gly Met Asn His Leu Phe Ala Thr Leu 
AMP 

PvuI 



GCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAAT 

— ■ I 1 I I 1 I I | | 7120 

CGAGGAAGCCAGGAGGCTAGCAACAGTCTTCATTCAACCGGCGTCACAATAGTGAGTACCAATACCGTCGTGACGTATTA 

Glu Lys Pro Gly Gly Ile Thr Thr Leu Leu Leu Asn Ala Ala Thr Asn Asp Ser Met Thr Ile Ala Ala Ser Cys Leu 
AMP 



TCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTC ATTCTGAGAATAGTGTAT 

I — ■ I ■ ■ 1 I 1 I I I 7200 

AGAGAATGACAGTACGGTAGGC ATTCTACGAAAAGACACTGACCACTCATGAGTTGGTTCAGTAAGACTCTTATCACATA 

Glu Arg Val Thr Met Gly Asp Thr Leu His Lys Glu Thr Val Pro Ser Tyr Glu Val Leu Asp Asn Gln Ser Tyr His Ile 
AMP 



DraI 



GCGGCGACCGAGTTGCTCTTGCCCGGCGTCAACACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCA 
1 ' ' ' I I i ■ ... i ■ ... | ■ ... i ■ ... | | | ,,, ) ,,,. | 7280 

CGCCGCTGGCTCAACGAGAACGGGCCGCAGTTGTGCCCTATTATGGCGCGGTGTATCGTCTTGAAATTTTCACGAGTAGT 

Arg Arg Gly Leu Gln Glu Gln Gly Ala Asp Val Arg Ser Leu Val Ala Gly Cys Leu Leu Val Lys Phe Thr Ser Met Met 
AMP 



TTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCA 

' ' ' ' I 1 I 1 1 1 ' ' 1 1 I 1 ■ 1 ■ I I ■ ■ ■ ■ I I 7360 

AACCTTTTGCAAGAAGCCCCGCTTTTGAGAGTTCCTAGAATGGCGACAACTCTAGGTCAAGCTACATTGGGTGAGCACGT 

Pro Phe Arg Glu Glu Pro Arg Phe Ser Glu Leu Ile Lys Gly Ser Asn Leu Asp Leu Glu Ile Tyr Gly Val Arg Ala 
AMP 



CCCAACTGATCTTC AGC ATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAAC AGGAAGGC AAAATGCCGCAAAAAA 

I I — 1 1 I I I I I I 7440 

GGGTTGACTAGAAGTCGTAGAAAATGAAAGTGGTCGCAAAGACCC ACTCGTTTTTGTCCTTCCGTTTTACGGCGTTTTTT 

Gly Leu Gln Asp Glu Ala Asp Lys Val Lys Val Leu Thr Glu Pro His Ala Phe Val Pro Leu Cys Phe Ala Ala Phe Phe 
AMP 



GGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATT 

I 1 I ' ' I I I I I 1 ■ I 7520 

CCCTTATTCCCGCTGTGCCTTTACAACTTATGAGTATGAGAAGGAAAAAGTTATAATAACTTCGTAAATAGTCCC AATAA 

Pro Ile Leu Ala Val Arg Phe His Gln Ile Ser Met 
AMP 



BspHI 

GTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTG 

I I 1 — 11 I I I I 1 1 1 1 I 1 1 " I 7600 

CAGAGTACTCGCCTATGTATAAACTTACATAAATCTTTTTATTTGTTTATCCCCAAGGCGCGTGTAAAGGGGCTTTTCAC 



BspHI 

CCACCTGACGTCTAAGAAACCATTATTATC ATGACATTAACCTATAAAAATAGGCGTATC ACGAGGCCCTTTCGTCTTCA 

1 — ' I I H I ■ ■ — I I — + 7680 

GGTGGACTGCAGATTCTTTGGTAATAATAGTACTGTAATTGGATATTTTTATCCGCATAGTGCTCCGGGAAAGCAGAAGT 



AGAAT 

■ 1 1 ■ > 7685 

TCTTA 




FIG. 11 



Thursday, June 13, 2002 3:55 PM FIG. 12 Page 1 

GD2415(pLNBIv-M4W).MPD (1 > 7428) Site and Sequence 
Enzymes : 35 of 538 enzymes (Filtered) 

Settings : Circular, Certain Sites Only, Standard Genetic Code 
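
The enzyme labels printed above the sequence rows can be cross-checked against the sequence itself. The following is a minimal sketch only, not the software that produced this map: it assumes the commonly published recognition sequences for four of the enzymes named in the figure, and the helper name find_sites is hypothetical. Because the map settings treat the plasmid as circular, the search wraps around the origin.

```python
# Recognition sequences (assumed from standard references, not from this map).
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "NdeI": "CATATG",
}


def find_sites(seq, site, circular=True):
    """Return 1-based start positions of `site` on the top strand of `seq`.

    The sites above are palindromic, so scanning the top strand alone
    also finds every occurrence on the complementary strand.
    """
    # Drop the stray spaces that OCR leaves inside sequence rows.
    seq = seq.upper().replace(" ", "")
    n = len(seq)
    # For a circular molecule, append the first len(site) - 1 bases so that
    # a site spanning the origin is still found.
    search = seq + seq[: len(site) - 1] if circular else seq
    hits = []
    start = search.find(site)
    while start != -1 and start < n:
        hits.append(start + 1)  # report 1-based coordinates, as in the map
        start = search.find(site, start + 1)
    return hits
```

To check a label, paste the full top-strand sequence (all rows joined) into find_sites and compare the returned coordinates with the ruler positions under the label; a single row can be checked with circular=False.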

XmnI 

GAATTAATTCATACC AGATC ACCGAAAACTGTCCTCCAAATGTGTCCCCCTC AC ACTCCCAAATTCGCGGGCTTCTGCCT 

I ■ ■ ■ i ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ | | | | ■ ■ ■ ■ I ■ ■ ■ ■ I I 80 

CTTAATTAAGT ATGGTCTAGTGGCTTTTGACAGGAGGTTTACACAGGGGGAGTGTGAGGGTTTAAGCGCCCGAAGACGGA 

SacII 

CTTAGACC ACTCT ACCCTATTCCCCACACTCACCGGAGCCAAAGCCGCGGCCCTTCCGTTTCTTTGCTTTTGAAAGACCC 

■ ■ ■ ■ I 1 I I I I 1 I 160 

GAATCTGGTGAGATGGGATAAGGGGTGTGAGTGGCCTCGGTTTCGGCGCCGGGAAGGC AAAGAAACG AAAACTTTCTGGG 



5' LTR 



MoMSV 5' LTR 
NheI 

I 

CACCCGTAGGTGGCAAGCTAGCTTAAGTAACGCCACTTTGC AAGGCATGGAAAAAT AC AT AACTGAGAATAGAAAAGTTC 
I | ■ ... i .... | | i | | | 240 

GTGGGCATCC ACCGTTCGATCGAATTCATTGCGGTGAAACGTTCCGTACCTTTTTATGTATTGACTCTTATCTTTTCAAG 

5' LTR 



•MoMSV 5' LTR- 



AGATC AAGGTC AGGAAC AAAGAAACAGCTGAATACC AAACAGGATATCTGTGGTAAGCGGTTCCTGCCCCGGCTC AGGGC 

I I I I i ■ ■ ■ ■ | 1 I 320 

TCTAGTTCC AGTCCTTGTTTCTTTGTCGAC TTATGGTTTGTCCTAT AGACACCATTCGCCAAGGACGGGGCCGAGTCCCG 

5' LTR 



■MoMSV 5' LTR- 



CAAGAAC AGATGAGACAGCTGAGTGATGGGCC AAACAGGATATCTGTGGTAAGC AGTTCCTGCCCCGGCTCGGGGCCAAG 
I ■ ... | ... ■ | | | | | | 400 

GTTCTTGTCTACTCTGTCGACTCACTACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACGGGGCCGAGCCCCGGTTC 

__ 



■ MoMSV 5' LTR- 



AACAGATGGTCCCCAGATGCGGTCCAGCCCTC AGC AGTTTCTAGTGAATCATCAGATGTTTCC AGGGTGCCCC AAGGACC 
I ■ ■ ■ — I I I | .... i .... | | | 480 

TTGTCTACCAGGGGTCTACGCCAGGTCGGGAGTCGTCAAAGATCACTTAGTAGTCTACAAAGGTCCCACGGGGTTCCTGG 

__ 



•MoMSV 5' LTR- 



TGAAAATGACCCTGTACCTTATTTGAACTAACCAATCAGTTCGCTTCTCGCTTCTGTTCGCGCGCTTCCGCTCTCCGAGC 
I I I I I I | .... i .... | 560 

ACTTTTACTGGGACATGGAATAAACTTGATTGGTTAGTC AAGCGAAGAGCGAAGAC AAGCGCGCG AAGGCGAGAGGCTCG 



■MoMSV 5' LTR- 



SacI AscI SmaI KpnI 

i II 

TC AATAAAAGAGCCCACAACCCCTC ACTCGGCGCGCCAGTCTTCCGATAGAC TGCGTCGCCCGGGTACCCGTATTCCC AA 

1 I I — H — I' " — m H 1 I 640 

AGTTATTTTCTCGGGTGTTGGGGAGTGAGCCGCGCGGTC AGAAGGCT ATCTGACGC AGCGGGCCC ATGGGC ATAAGGGTT 

__ 



•MoMSV 5' LTR- 



FIG. 12 (cont) Page 2 

TAAAGCCTCTTGCTGTTTGC ATCCGAATCGTGGTCTCGCTGTTCCTTGGGAGGGTCTCCTCTG AGTGATTGACTACCC AC 
1 1 1 ■ I i i ■ ... i ■ ■■■ | .... i ■ ■■■ i | | ■ .. . i ■■■■ | | 720 

ATTTCGGAGAACGAC AAACGTAGGCTTAGC ACCAGAGCGACAAGGAACCCTCCC AGAGG AGACTC ACTAACTGATGGGTG 



MoMSV 5' LTR- 



GACGGGGGTCTTTCATTTGGGGGCTCGTCCGGGATTTGGAGACCCCTGCCCAGGGACCACCGACCCACCACCGGGAGGTA 

' 1 ' 1 1 1 1 ' ' 1 | ■ , ■ ■ i ■ ■ ■ , | , , , , | | , , . , | | | 800 

CTGCCCCCAGAAAGTAAACCCCCGAGCAGGCCCTAAACCTCTGGGGACGGGTCCCTGGTGGCTGGGTGGTGGCCCTCC AT 

5' LTR 
MoMSV 5' LTR 



SpeI 

. i 

AGCTGGCCAGCAACTTATCTGTGTCTGTCCGATTGTCTAGTGTCTATGTTTGATGTTATGCGCCTGCGTCTGTACTAGTT 

1 t I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I 880 

TCGACCGGTCGTTGAATAGAC AC AGACAGGCTAACAGATC AC AGATACAAACTAC AAT ACGCGGACGCAGACATGATC AA 

Pkg Rgn 
Extended Packaging Region 



AGCTAACTAGCTCTGTATCTGGCGGACCCGTGGTGG AACTGACGAGTTCTGAACACCCGGCCGC AACCCTGGGAGACGTC 

I I I "" I I I I I 960 

TCGATTGATCGAGAC ATAGACCGCCTGGGCACC ACCTTGACTGCTCAAGACTTGTGGGCCGGCGTTGGGACCC TCTGC AG 
Pkg Rgn 
Extended Packaging Region 



CCAGGGACTTTGGGGGCCGTTT TTGTGGCCCGACCTGAGGAAGGGAGTCGATGTGGAATCCGACCCCGTCAGGATATGTG 

I 1 1 1 1 1 1 1 1 1 I I I I I I I 1040 

GGTCCCTGAAACCCCCGGCAAAAACACCGGGCTGGACTCCTTCCCTC AGCTACACCTTAGGCTGGGGCAGTCCTATAC AC 
Pkg Rgn 

Extended Packaging Region 



GTTCTGGTAGGAGACGAGAACCTAAAACAGTTCCCGCCTCCGTCTGAATTTTTGCTTTCGGTTTGGAACCGAAGCCGCGC 

1 i I I I I I 1 1120 

CAAGACC ATCCTCTGCTCTTGGATTTTGTC AAGGGCGGAGGCAGACTTAAAAACGAAAGCC AAACCTTGGCTTCGGCGCG 
Pkg Rgn 
Extended Packaging Region 



PstI PstI 

! I 

GTCTTGTCTGCTGC AGCGCTGCAGCATCGTTCTGTGTTGTCTCTGTCTGACTGTGTTTCTGTATTTGTCTGAAAATTAGG 
I I I I I ' ■ ■ ■ | ■ ... i ■ ... | .... i .... | 1200 

CAGAACAGACGACGTCGCGACGTCGT AGCAAGAC AC AAC AGAGACAGAC TGAC AC AAAGAC AT AAACAGACTTTTAATCC 

Pkg Rgn 

Extended Packaging Region 



GCCAGACTGTTACCACTCCC TT AAGTTTGACCTTAGGTC ACTGGAAAGATGTCGAGCGGATCGCTCACAACCAGTCGGTA 

— 1 — 1 — I I I I 1 — I 1 1 — I ■ ■ ■ — I 1280 

CGGTCTGACAATGGTGAGGGAATTCAAACTGGAATCCAGTGACCTTTCTACAGCTCGCCTAGCGAGTGTTGGTCAGCCAT 



Pkg Rgn 
Extended Packaging Region 

PstI 

GATGTCAAGAAGAGACGTTGGGTTACCTTCTGCTCTGC AGAATGGCC AACCTTT AACGTCGGATGGCCGCGAGACGGCAC 

I I i ■ ■ ■ — I I I | , ... i ■ | 1360 

CTACAGTTCTTCTCTGCAACCCAATGGAAGACGAGACGTCTTACCGGTTGGAAATTGCAGCCTACCGGCGCTCTGCCGTG 

Pkg Rgn 

Extended Packaging Region 



CTTTAACCGAGACCTCATCACCCAGGTTAAGATCAAGGTCTTTTC ACCTGGCCCGC ATGGACACCC AGACCAGGTCCCCT 

I I I I I I I I 1440 

GAAATTGGCTCTGGAGTAGTGGGTCCAATTCTAGTTCCAGAAAAGTGGACCGGGCGTACCTGTGGGTCTGGTCCAGGGGA 

Pkg Rgn 

Extended Packaging Region 



AC ATCGTGACCTGGGAAGCCTTGGCTTTTGACCCCCCTCCCTGGGTC AAGCCCTTTGTAC ACCCTAAGCCTCCGCCTCCT 

I I I I ■ ■ ■ ■ i ■ ■ ■ ■ 1 I I I 1520 

TGTAGCACTGGACCCTTCGGAACCGAAAACTGGGGGGAGGGACCCAGTTCGGGAAACATGTGGGATTCGGAGGCGGAGGA 

Pkg Rgn 

Extended Packaging Region 



CTTCCTCCATCCGCCCCGTCTCTCCCCCTTGAACCTCCTCGTTCGACCCCGCCTCGATCCTCCCTTTATCCAGCCCTCAC 

I I I I I" — I — - I — I- 1600 

GAAGGAGGTAGGCGGGGCAGAGAGGGGGAACTTGGAGGAGCAAGCTGGGGCGGAGCTAGGAGGGAAATAGGTCGGGAGTG 

Pkg Rgn 

Extended Packaging Region 



NarI EcoRI BclI 

i i 

TCCTTCTCTAGGCGCCGGAATTCCGATCTGATC AAGAGACAGGATGAGGATCGTTTCGCATGATTGAACAAGATGGATTG 
■ ■ ■ ■ I | .... i .... | | | | | , ■ ■ , i ■ ■ ■ ■ | 1680 

AGGAAGAGATCCGCGGCCTTAAGGCTAGACTAGTTCTCTGTCCTACTCCTAGCAAAGCGTACTAACTTGTTCTACCTAAC 

Pkg Rgn | Met Ile Glu Gln Asp Gly Leu 

Extended Packaging | Neomycin Phosphotransferase 



CACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGC 

I ■ ■ ■ ■ i ■ ■ ■ ■ I ■ ■ . ■ i i , ■ ■ ■ I I | ■ ■■■ i ■ ■■■ | 1760 

GTGCGTCC AAGAGGCCGGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTGTCTGTTAGCCGACGAGACT ACG 

His Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala 
Neomycin Phosphotransferase 



NarI 

CGCCGTGTTCCGGCTGTC AGCGC AGGGGCGCCCGGTTCTTTTTGTC AAGACCGACCTGTCCGGTGCCCTGAATGAACTGC 
I ' ' ■ ' | ■ ... i ■ ... | ... ■ | ■ ■ . ■ | | 1 | 1840 

GCGGCACAAGGCCGAC AGTCGCGTCCCCGCGGGCC AAGAAAAACAGTTCTGGCTGGAC AGGCCACGGGACTT ACTTGACG 

Ala Val Phe Arg Leu Ser Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala Leu Asn Glu Leu 
Neomycin Phosphotransferase 



PstI FspI 

i 

AGGACGAGGC AGCGCGGCTATCGTGGCTGGCC ACG ACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCG 

■ ■ ■ ■ I 1 1 1 1 I I I I I | .... i .... | 1920 

TCCTGC TCCGTCGCGCCGATAGC ACCGACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGC AAC AGTGACTTCGC 

Gln Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val Thr Glu Ala 
Neomycin Phosphotransferase 



FIG. 12 (cont) Page 4 

GGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATC 

I 1 1 1 1 I I ■ ■ ■ 1 I ■ ■ ■ ' I I I I 2000 

CCTTCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGACAGTAGAGTGGAACGAGGACGGCTCTTTCAT AG 

Gly Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser Ser His Leu Ala Pro Ala Glu Lys Val Ser 
Neomycin Phosphotransferase 

CATCATGGCTGATGC AATGCGGCGGCTGCATACGCTTG ATCCGGCTACCTGCCCATTCGACCACCAAGCGAAAC ATCGCA 

1 1 1 1 I i ■ ... i ■ ... [ 1 ■ ■ ■ I I I I ■ ... i ... ■ | 2080 

GTAGTACCGACTACGTT ACGCCGCCGACGTATGCGAACTAGGCCGATGGACGGGTAAGCTGGTGGTTCGCTTTGTAGCGT 

Ile Met Ala Asp Ala Met Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln Ala Lys His Arg 
Neomycin Phosphotransferase 

TCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATC AGGATGATCTGGACGAAGAGC ATC AGGGGCTCGCGCCA 

I I I I I I I I 2160 

AGCTCGCTCGTGC ATGAGCC TACCTTCGGCCAGAAC AGCTAGTCCTACTAGACCTGCTTCTCGTAGTCCCCGAGCGCGGT 

Ile Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly Leu Ala Pro 
Neomycin Phosphotransferase 



SphI NcoI 

GCCGAAC TGTTCGCC AGGCTC AAGGCGCGC ATGCCCGACGGCGAGGATCTCGTCGTGACCC ATGGCGATGCCTGCTTGCC 

I I I ■ ■ ■ ■ I I ■ ■ ■ 1 I I I 2240 

CGGCTTGACAAGCGGTCCGAGTTCCGCGCGTACGGGCTGCCGCTCCTAGAGCAGCAC TGGGTACCGCTACGGACGAACGG 

Ala Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp Gly Glu Asp Leu Val Val Thr His Gly Asp Ala Cys Leu Pro 
Neomycin Phosphotransferase 



NaeI 

GAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACA 

I I 1 1 1 1 I 1 — I I I I I 2320 

CTTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGT AGC TGACACCGGCCGACCC AC ACCGCCTGGCGATAGTCCTGT 

Asn Ile Met Val Glu Asn Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp Arg Tyr Gln Asp 
Neomycin Phosphotransferase 



TAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCC 
I I | ■ ... i ■ ... | | | | | 2400 

ATCGC AACCGATGGGCACTAT AACGACTTCTCGAACCGCCGCTTACCCGACTGGCGAAGGAGCACGAAATGCC ATAGCGG 

Ile Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala 
Neomycin Phosphotransferase 



GCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACC 

I I I — I 1 1 — 1 ) ■ 1 ■ ■ I — ■ ■ ■ I I I 2480 

CGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAACTGCTCAAGAAGACTCGCCCTGAGACCCCAAGCTTTACTGG 

Ala Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe • 
Neomycin Phosphotransferase 



GACCAAGCGACGCCCAACCTGCCATC ACGAGATTTCGATTCC ACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGT 

I I I — I I I I 1 2560 

CTGGTTCGCTGCGGGTTGGACGGTAGTGCTCT AAAGCT AAGGTGGCGGCGGAAGATACTTTCC AACCCGAAGCCTTAGCA 



SmaI 

TTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCGGGCTCGATCCCC 

m ■ ■ I 1 1 H I ■ ■ ■ — I — I — ■ — I I 2640 

AAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGGCCCG AGCTAGGGG 



NaeI 

NruI 

TCGCGAGTTGGTTCAGCTGCTGCCTGAGGCTGGACGACC TCGCGGAGTTCTACCGGCAGTGCAAATCCGTCGGC ATCC AG 

I ■ ... ) ■ ... i .... | I I I I I 2720 

AGCGCTCAACCAAGTCGACGACGGACTCCGACCTGCTGGAGCGCCTC AAGATGGCCGTCACGTTTAGGCAGCCGTAGGTC 

PstI 

GAAACC AGCAGCGGCTATCCGCGC ATCC ATGCCCCCG AACTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGC 

I | ,,,, i .... | | — I ■ ■ I ■ ■ ■ — i — - — H I 2800 

CTTTGGTCGTCGCCGATAGGCGCGTAGGTACGGGGGC TTGACGTCCTC ACCCCTCCGTGC TACCGGCGAAACCAGCTCCG 

BamHI 

GGATCCTAGCAGAAAAAT AAGACTTGATTCCCCCTTAAAATTACAACTGCTAGAAAATGAATGGCTCTCCCGCC TTTTTT 

■ ■ ■ ■ i ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I I I I I 2880 

CCTAGGATCGTCTTTTTATTCTGAACTAAGGGGGAAT TTTAATGTTGACGATCTTTTACTT ACCGAGAGGGCGGAAAAAA 



Blv Pro 



Blv Promoter 

NarI 

GAGGGGGAATC ATTTGT ATGAAAGATCATGCCGACCT AGGCGCCGCCACCGCCCCGTAAACC AGACAGAGACGTC AGCTG 

I I I ■ ■ ■ I I I I I 2960 

CTCCCCCTTAGTAAACATACTTTCTAGTACGGCTGGATCCGCGGCGGTGGCGGGGCATTTGGTCTGTCTCTGCAGTCGAC 

' " Blv Pro * 



■Blv Promoter - 



CC AGAAAAGCTGGTGACGGCAGCTGGTGGCTAGAATCCCCGTACCTCCCCAACTTCCCCTTTCCCGAAAAATCCAC ACCC 

I .. . . i .... I I 1 ■ ■ 1 i ■ ■ ■ 1 I I I I I 3040 

GGTCTTTTCGACC ACTGCCGTCGACCACCGATCTTAGGGGC ATGGAGGGGTTGAAGGGGAAAGGGCTTTTTAGGTGTGGG 

Blv Pro 



Blv Promoter 

NaeI 

I 

TGAGCTGCTGACCTC ACCTGCTGATAAATTAATAAAATGCCGGCCCTGTCGAGTTAGCGGC ACCAGAAGCGTTCTTCTCC 

■ ' ■ 1 ' ■ ' ■ ' I I ■ 1 ■ ■ I ■ ■ ■ ■ | ■ ... i .... | .... i .... | .... t ■ ... i I 3120 

ACTCGACGACTGGAGTGGACGACT ATTT AATT ATTTT ACGGCCGGGAC AGCTCAATCGCCGTGGTCTTCGC AAGAAGAGG 
_____ Blv Pro ~~~~~ 



Blv Promoter 

XhoI HindIII 

I I 

TGAGACCCTCGTGCTCAGCTCTCGGTCCTGCCTCGAGAAGCTTGTTATC AC AAGTTTGTACAAAAAAGCAGGCTTCGAAG 
I I | .... i .... | | | ■ . , ■ i | 3200 

ACTCTGGGAGC ACGAGTCGAGAGCCAGGACGGAGCTCTTCGAACAATAGTGTTC AAACATGTTTTTTCGTCCGAAGCTTC 



Blv Pro 



•Blv Promoter- 



attB1 

FIG. 12 (cont) Page 6 

XmnI SalI BamHI KpnI EcoRI 

GAGATAGAACCAATTCTCTAAGGAAATACTTAACGTCGACTGGATCCGGTACCGAATTCGATCCACATGCCTAAAAAACG 

+ 3280 

CTCTATCTTGGTTAAGAGATTCCTTTATGAATTGCAGCTGACCTAGGCCATGGCTTAAGCTAGGTGTACGGATTTTTTGC 

Met Pro Lys Lys Arg 
Brex M4 



ApaI 

ACGGTCCCGAAGACGCCC AC AACCGATCATC AGATGGC AAGTGTTGTTGGTTGGGGGCCCC ACTCTCTAC ATGCCTGCCC 

— H — ■ — — I I I ■ ■ ■ ■ I I I 3360 

TGCC AGGGCTTCTGCGGGTGTTGGCTAGT AGTC TACCGTTCACAAC AACCAACCCCCGGGGTGAGAGATGTACGGACGGG 

Arg Ser Arg Arg Arg Pro Gln Pro Ile Ile Arg Trp Gln Val Leu Leu Val Gly Gly Pro Thr Leu Tyr Met Pro Ala 
Brex M4 



ClaI ApaI 



GGCCCTGGTTTTGTCCAATGATGTC ACCATCGATGCCTGGTGCCCCCTCTGCGGGCCCCATGAGCGACTCCAAT TCGAAA 

■ ■ ■ ■ | ■ ■■■ i ■ ■■■ i | | | ■ 1 ■ ■ I ■ I ■ ■ ■ ■ I 3440 

CCGGGACC AAAACAGGTTACT AC AGTGGT AGCT ACGGACC ACGGGGGAGACGCCCGGGGTACTCGC TGAGGTTAAGCTTT 

Arg Pro Trp Phe Cys Pro Met Met Ser Pro Ser Met Pro Gly Ala Pro Ser Ala Gly Pro Met Ser Asp Ser Asn Ser Lys 
Brex M4 



GGATCGACACCACGCTCACCTGCGAGACCCACCGTATCAACTGGACCGCCGATGGACGACCTTGCGGCCTCAATGGAACG 
I | ■ ... i ■■■■ [ ■■■, i ■■■■ | | | | ■ . ■ ■ i 3520 

CCT AGCTGTGGTGCGAGTGGACGCTCTGGGTGGC ATAGTTGACCTGGCGGCTACCTGCTGGAACGCCGGAGTTACC TTGC 

Gly Ser Thr Pro Arg Ser Pro Ala Arg Pro Thr Val Ser Thr Gly Pro Pro Met Asp Asp Leu Ala Ala Ser Met Glu Arg 
Brex M4 



ApaI 

TTGTTCCCTCGACTGC ATGTC TCC GAG ACCCGCCCCC AAGGGCCCCGACGACTCTGGATC AAC TGCC CCCTTCCGGCC GT 

■ ■ ■ ■ i ■ ■ ■ ■ I I | , ... i ■ ... i | | i i 3600 

AAC AAGGGAGCTGACGT AC AG AGGCTCTGGGCGGGGGTTCCCGGGGCTGCTGAGACCT AGTTG ACGGGGGAAGGCCGGC A 

Cys Ser Leu Asp Cys Met Ser Pro Arg Pro Ala Pro Lys Gly Pro Asp Asp Ser Gly Ser Thr Ala Pro Phe Arg Pro 
Brex M4 



BglII 

TCGCGCTCAGCCCGGCCCGGTTAGATCTTCCCCCTTCGAGCGGTCCCCCTTCCAGCCCTACCAATGCCAATTGCCCTCGG 

I ■ ■ ■ ■ i ■ ■ ' ' I t | ■ ... i ■ ■■■ | i | | 3680 

AGCGCGAGTCGGGCCGGGCCAATCTAGAAGGGGGAAGCTCGCCAGGGGGAAGGTCGGGATGGTTACGGTTAACGGGAGCC 

Phe Ala Leu Ser Pro Ala Arg Leu Asp Leu Pro Pro Ser Ser Gly Pro Pro Ser Ser Pro Thr Asn Ala Asn Cys Pro Arg 
Brex M4 



CCTCTAGCGACGGTTGCCCCATT ATCGGGCACGGCCTTCTTCCCTGGAACAACTTAGTAACGCATCCTGTCCTC AGAAAA 

■ ■ ■ ■ I I I I I ■ ■ . . | . . ■ ■ i ■ ■ ■ ■ | 3760 

GGAGATCGCTGCCAACGGGGT AATAGCCCGTGCCGGAAGAAGGGACCTTGTTGAATCATTGCGTAGGAC AGGAGTCTTTT 

Pro Leu Ala Thr Val Ala Pro Leu Ser Gly Thr Ala Phe Phe Pro Gly Thr Thr • 
Brex M4 

XhoI XbaI ClaI 


GTCCTTATATTAAATCAAATGGGACCTCGAGATATCT AGACCCAGCTTTCTTGTAC AAAGTGGTGATAAC ATCGATAATC 

■ ■ ■ ■ I 1 1 1 I 1 I 1 3840 

CAGGAATATAATTTAGT T TACCCTGGAGCTCT ATAGATC TGGGTCGAAAGAACATGTTTCACC AC TATTGTAGCTATTAG 



att B2 



■att B2- 



WPRE 



AACCTCTGGATTACAAAATTTGTGAAAGATTGAC TGGTATTCTTAACT ATGTTGCTCCTTTTACGCTATGTGGATACGCT 

■ ■ ■ ■ i ■ ■ ■ ■ I 1 1 ■ ■ ■ ■ i ■ ■ ■ ■ 1 I I 1 I 3920 

TTGGAGACCTAATGTTTTAAACACTTTCTAACTGACCAT AAGAATTGAT ACAACGAGGAAAATGCGATACACCT ATGCGA 



•WPRE- 



GC TTT AATGCCTT TGTATCATGCTATTGCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTAT AAATCCTGGTTGC TGTC 
I I ■ • ■ ■ I I ■ ■ ■ ■ I I 1 I 4000 

CGAAATTACGGAAAC AT AGTACGATAACGAAGGGC ATACCGAAAGT AAAAGAGGAGGAAC ATATTT AGGACC AACGAC AG 



■WPRE- 



TCTTTATGAGGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTGGTGTGCACTGTGTTTGCTGACGCAACCCCCACTGGTT 

- I I I I I I ■ ■ ■ ■ I I 4080 

AGAAATACTCCTC AACACCGGGCAAC AGTCCGTTGCACCGC ACCACACGTGACACAAACGACTGCGTTGGGGGTGACC AA 



-WPRE- 



GGGGCATTGCC AC CACCTGTC AGCTCCTTTCCGGGACTTTCGCTTTCCCCCTCCCT ATTGCCACGGCGGAACTC ATCGCC 

I | 1 | | I I 4160 

CCCCGTAACGGTGGTGGACAGTCGAGGAAAGGCCCTGAAAGCGAAAGGGGGAGGGATAACGGTGCCGCCTTGAGTAGCGG 



•WPRE- 



GCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGTTGGGCACTGAC AATTCCGTGGTGTTGTCGGGGAAATC ATCGTC 
H I ' I I 1 1 1 1 i 1 1 1 1 I I H I 4240 

CGGACGGAACGGGCGACGACCTGTCCCCGAGCCGACAACCCGTGACTGTTAAGGCACCACAACAGCCCCTTTAGTAGCAG 



•WPRE- 



CTTTCCTTGGCTGCTCGCCTGTGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTGCTACGTCCCTTCGGCCCTCAATC 
' I I 1 ■ I I I I I ■ *h — . — H- 4320 

GAAAGGAACCGACGAGCGGACACAACGGTGGACCT AAGACGCGCCCTGC AGGAAGACG ATGCAGGGAAGCCGGG AGTTAG 



■WPRE- 



SacII NaeI 



CAGCGGACCTTCCTTCCCGCGGCCTGCTGCCGGCTCTGCGGCCTCTTCCGCGTCTTCGCCTTCGCCCTCAGACGAGTCGG 

I 1 I ■ ■ ■ ■ I 1 I I 1 4400 

GTCGCCTGGAAGGAAGGGCGCCGGACGACGGCCGAGACGCCGGAGAAGGCGC AGAAGCGGAAGCGGGAGTCTGCTC AGCC 



WPRE 



FIG. 12 (cont) 

ClaI 

ATCTCCCTTTGGGCCGCCTCCCCGCCTGATCGATAAAATAAAAG ATTTT ATTTAGTCTCCAGAAAAAGGGGGGAATGAAA 

I I I I I I I I 4480 

TAGAGGGAAACCCGGCGGAGGGGCGGACTAGCTATTTTATTTTCT AAAATAAATCAGAGGTCTTTTTCCCCCCTTACTTT 



3' LTR 

WPRE | MoMuLV 3' LTR 

NheI 

| 

GACCCC ACCTGT AGGTTTGGC AAGCTAGCTT AAGTAACGCC ATTTTGC AAGGCATGGAAAAAT AC AT AACTGAGAATAGA 

I I I — ■ ■ ■ I 1 I I I 4560 

CTGGGGTGGACATCCAAACCGTTCGATCGAATTCATTGCGGTAAAACGTTCCGTACCTTTTTATGTATTGACTCTTATCT 

3' LTR 



•MoMuLV 3' LTR- 



GAAGTTCAGATC AAGGTC AGGAAC AGATGGAACAGCTGAAT ATGGGCC AAAC AGGATATCTGTGGT AAGC AGTTCCTGCC 

I I 1 ■ ■ ■ ■ I I ■ ■ ■ ■ i ■ ■ ■ ■ I I I 4640 

CTTCAAGTCTAGTTCCAGTCCTTGTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACGG 

3' LTR 



•MoMuLV 3' LTR- 



CCGGCTCAGGGCCAAGAACAGATGGAACAGCTGAATATGGGCC AAAC AGGATATCTGTGGT AAGC AGTTCCTGCCCCGGC 

I | ... m .... | | | | ■ ■ ■ ■ i | 4720 

GGCCGAGTCCCGGTTCTTGTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACGGGGCCG 

3' LTR 



•MoMuLV 3' LTR- 



XbaI 



TC AGGGCC AAGAAC AGATGGTCCCCAGATGCGGTCC AGCCCTC AGC AGTTTCTAGAGAACCATC AGATGTTTCCAGGGTG 
I I I I | ■ , ■ , i ■ ■ ■ ■ | | 1 4800 

AGTCCCGGTTCTTGTCTACCAGGGGTCTACGCCAGGTCGGGAGTCGTCAAAGATCTCTTGGTAGTCTACAAAGGTCCCAC 

__ 



•MoMuLV 3' LTR- 



CCCCAAGGACCTGAAATGACCCTGTGCCTTATTTGAACTAACC AATCAGTTCGCTTCTCGCTTCTGTTCGCGCGCTTCTG 
I — h 1 h — I ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I — 1 4880 

GGGGTTCCTGGACTTTACTGGGACACGGAATAAACTTGATTGGTT AGTCAAGCGAAGAGCGAAGAC AAGCGCGCGAAGAC 

3' LTR 



•MoMuLV 3' LTR- 



SacI NarI SmaI KpnI 

CTCCCCGAGCTC AATAAAAGAGCCC AC AACCCCTC ACTCGGGGCGCC AGTCCTCCGATTGACTGAGTCGCCCGGGTACCC 

I I I I 1 1 I ■ - — I 4960 

GAGGGGCTCGAGTTATTTTCTCGGGTGTTGGGGAGTGAGCCCCGCGGTCAGGAGGCTAACTGACTCAGCGGGCCCATGGG 

__ 



•MoMuLV 3' LTR- 



FIG. 12 (cont) Page 9 

GTGTATCC AATAAACCCTCTTGC AGTTGC ATCCGACTTGTGGTCTCGCTGTTCCTTGGGAGGGTCTCCTCTGAGTGATTG 
H-~ — ^ — ■ ■ I I I — — —hh — ■ ■ \ — " I 1 1 ' — I — — + 5040 

CACATAGGTTATTTGGGAGAACGTCAACGTAGGCTGAACACCAGAGCGACAAGGAACCCTCCC AGAGGAGACTCACTAAC 

3' LTR 



MoMuLV 3' LTR- 



ACTACCCGTCAGCGGGGGTCTTTCATTTGGGGGCTCGTCCGGGATCGGGAGACCCCTGCCCAGGGACCACCGACCC ACCA 

1 ■ I I I ■ ■ ■ 1 1 ■ ' ■ ■ I | .... i .... | | 5120 

TGATGGGCAGTCGCCCCCAGAAAGTAAACCCCCGAGCAGGCCCTAGCCCTCTGGGGACGGGTCCCTGGTGGCTGGGTGGT 



3' LTR 



•MoMuLV 3' LTR- 



CCGGGAGGTAAGC TGGCTGCCTCGCGCGTTTCGGTGATGACGGTGAAAACCTC TGAC AC ATGC AGCTCCCGGAGACGGTC 

I 1 1 1 ■ I I I 1 i ■ ... i ■ ■■■ | | 5200 

GGCCCTCCATTCGACCGACGGAGCGCGCAAAGCCACTACTGCCACTTTTGGAGACTGTGTACGTCGAGGGCCTCTGCCAG 



AC AGCTTGTCTGT AAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTC AGCGGGTGTTGGCGGGTGTCGGGGCGC 

■ ■ ■ ■ 1 ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I 1 ■ ■ ■ ■ I ■ ■ I | .... m ... | 5280 

TGTCGAACAGACATTCGCCTACGGCCCTCGTCTGTTCGGGCAGTCCCGCGCAGTCGCCCACAACCGCCCACAGCCCCGCG 



AGCCATGACCC AGTC ACGT AGCGATAGCGGAGTGTAT ACTGGCTTAACTATGCGGC ATC AG AGCAGATTGTACTGAGAGT 

I 1 I 1 ■ ■ 1 1 ■ I I I I 5360 

TCGGTACTGGGTCAGTGCATCGCTATCGCCTCACATATGACCGAATTGATACGCCGTAGTCTCGTCTAACATGACTCTCA 



NdeI 

GCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCCGCTTCCTCGCTCA 

I I I I ' ■ 1 ■ I I I ■ ■ ■ ■ I 5440 

CGTGGTATACGCCACACTTTATGGCGTGTCTACGCATTCCTCTTTTATGGCGT AGTCCGCGAGAAGGCGAAGGAGCGAGT 



CTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAA 

I I ' ■ ■ ' I , | ■ . . i ■ ■ ■ ■ | | | 5520 

GACTGAGCGACGCGAGCC AGC AAGCCGACGCCGCTCGCCATAGTCGAGTGAGTTTCCGCC ATT ATGCC AATAGGTGTCTT 



TCAGGGGATAACGCAGGAAAGAAC ATGTGAGC AAAAGGCC AGC AAAAGGCC AGG AACCGTAAAAAGGCCGCGTTGCTGGC 

— I | | | | | | | 5600 

AGTCCCCTATTGCGTCCTTTCTTGTACACTCGTTTTCCGGTCGTTTTCCGGTCCTTGGCATTTTTCCGGCGCAACGACCG 



GTTTTTCCATAGGCTCCGCCCCCCTGACG AGC ATCACAAAAATCGACGCTC AAGTCAGAGGTGGCGAAACCCGAC AGG AC 

■ ■ ■ ■ 1 ■ ■ ■ ■ I I ■ ' ■ ' 1 I ■ ■ ■ ■ i ■ ■ ■ ■ 1 ■ ■ ■ ■ i ■ ■ ■ ■ I I 5680 

CAAAAAGGTATCCGAGGCGGGGGGACTGCTCGTAGTGTTTTTAGCTGCGAGTTCAGTCTCCACCGCTTTGGGCTGTCCTG 



TATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTT ACCGGATACCTG 

■ ■ I ■ | ■■■■ ] ■■■■ | | | I 1 ■ ■ 1 i ■ 1 1 ' 1 ■ ■ ■ ■ 1 5760 

ATATTTCTATGGTCCGCAAAGGGGGACCTTCGAGGGAGC ACGCGAGAGGACAAGGCTGGGACGGCGAATGGCCTATGGAC 

TCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCG 
I ■ ' ' ' I ■ ■ ■ ■ I 1 I I I 1 5840 

AGGCGGAAAGAGGGAAGCCCTTCGC ACCGCGAAAGAGTATCGAGTGCGACATCCATAGAGTC AAGCCACATCC AGC AAGC 



CTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTC AGCCCGACCGCTGCGCCTT ATCCGGT AACTATCGTCTTGAGTCCA 

I 1 ■ ■ ■ ■ i ■ ■ ■ ■ 1 I ■ ... | ... ■ | ■,, [ ,,,, ) 5920 

GAGGTTCGACCCG AC AC ACGTGCTTGGGGGGC AAGTCGGGCTGGCGACGCGGAATAGGCC ATTGATAGC AGAACTCAGGT 



FIG. 12 (cont) Page 10 

ACCCGGTAAGACACGACTTATCGCCACTGGCAGC AGCC ACTGGTAAC AGGATT AGCAGAGCGAGGTATGTAGGCGGTGCT 

■ ■ ■ ■ I ■ ... | ■ ... i ■ ... | | | | | | 6000 

TGGGCCATTCTGTGCTGAATAGCGGTGACCGTCGTCGGTGACCATTGTCCTAATCGTCTCGCTCC ATAC ATCCGCC ACGA 

ACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTAC AC T AGAAGGAC AGTATTTGGT ATCTGCGCTCTGCTGAAGCC AGT 

I I I I I I I ■ ■ ■ - I 6080 

TGTCTCAAGAACT TC ACC ACCGGATTGATGCCGATGTGATCTTCCTGTC ATAAACCATAGACGCGAGACGACTTCGGTC A 

T ACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGC AAGC 

I ■ ■ ■ ■ i ■ ■ ■ ■ I I I I ■ I | .... i ■ ■■■ | 6160 

ATGGAAGCCTTTT TCTCAACC ATCGAGAACTAGGCCGTTTGTTTGGTGGCGACCATCGCCACC AAAAAAAC AAACGTTCG 

AGC AGATTACGCGCAGAAAAAAAGGATCTC AAGAAGATCCTTTGATC TTTTCTACGGGGTCTGACGCTC AGTGGAACGAA 

I 1 ■ ■ ■ ■ I 1 I ■ ■ ■ 1 i 1 1 1 1 I I I 6240 

TCGTC TAATGCGCGTCTTTTTTTCCT AGAGTTCTTCT AGGAAACT AGAAAAGATGCCCC AGACTGCGAGTC ACCTTGCTT 

BspHI DraI 

| I 

AAC TC ACGTTAAGGGATTTTGGTCATGAGATT ATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTT 

■ ■ ■ ■ i ■ ■ ■ ■ I I I I I I ■ ■ — I ■ — — h — ~- + 6320 

TTGAGTGCAATTCCCT AAAACCAGTACTCTAATAGTTTTTCCTAGAAGTGGATCT AGGAAAATTTAATTTTTACTTC AAA 

DraI 

TAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGA 

I I I I I I I ■ ■ ■ ■ i ■ ■ ■ ■ I 6400 

ATT TAGTTAGATTTC ATATATACTCATTTGAACC AGACTGTC AATGGTTACGAATTAGTC ACTCCGTGGATAGAGTCGCT 



• Trp His Lys Ile Leu Ser Ala Gly Ile Glu Ala Ile 
b-Lactamase 

TCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGC 

I I I I I I I I 6480 

AGACAGATAAAGC AAGTAGGTATC AACGGACTGAGGGGC AGC ACATC TATTGATGCTATGCCCTCCCGAATGGTAGACCG 

Gln Arg Asn Arg Glu Asp Met Thr Ala Gln Ser Gly Thr Thr Tyr Ile Val Val Ile Arg Ser Pro Lys Gly Asp Pro 
b-Lactamase 



CCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGC 

■ ... I ■ ... i ■ ... I | I I 1 ■ ■ ■ I ■ ... i , ... i ■ ... | 6560 

GGGTCACGACGTTACTATGGCGCTCTGGGTGCGAGTGGCCGAGGTCTAAAT AGTCGTTATTTGGTCGGTCGGCCTTCCCG 

Gly Leu Ala Ala Ile Ile Gly Arg Ser Gly Arg Glu Gly Ala Gly Ser Lys Asp Ala Ile Phe Trp Gly Ala Pro Leu Ala 
b-Lactamase 



CGAGCGCAGAAGTGGTCCTGC AAC TTT ATCCGCC TCCATCCAGTCT ATT AATTGTTGCCGGG A AGC TAG AGT AAGTAGTT 

1 .... i ■ ■■■ 1 ... ■ | ■ ... i ... ■ | | ■ ■ ■ ■ | | ■ ■ ■ | 6640 

GCTCGCGTCTTCACC AGGACGTTGAAATAGGCGGAGGT AGGTC AGATAATT AACAACGGCCCTTCGATCTCATTCATC AA 

Ser Arg Leu Leu Pro Gly Ala Val Lys Asp Ala Glu Met Trp Asp Ile Leu Gln Gln Arg Ser Ala Leu Thr Leu Leu Glu 
b-Lactamase 



FspI PstI 

CGCCAGTTAAT AGTTTGCGCAACGTTGTTGCC ATTGCTGC AGGC ATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCA 

■ ■ ■ ■ I I I ■ ' ■ ■ I 1 1 ■ ■ I ■ ... | ■ ... i ■ ... | | 6720 

GCGGTC AATTATC AAACGCGTTGC AACAACGGTAACGACGTCCGT AGCACC ACAGTGCGAGCAGCAAACCATACCGAAGT 

Gly Thr Leu Leu Lys Arg Leu Thr Thr Ala Met Ala Ala Pro Met Thr Thr Asp Arg Glu Asp Asn Pro Ile Ala Glu 
b-Lactamase 



FIG. 12 (cont) Page 11 

TTCAGCTCCGGTTCCC AACGATC AAGGCGAGTTACATGATCCCCC ATGTTGTGCAAAAAAGCGGTT AGCTCCTTCGGTCC 

I I I — ~H ■ ■ i ■ ■ ■ ■ | ■ ■ I I I 6800 

AAGTCGAGGCCAAGGGTTGCTAGTTCCGCTCAATGT ACTAGGGGGT AC AAC ACGTTTTTTCGCC AATCGAGGAAGCC AGG 

Asn Leu Glu Pro Glu Trp Arg Asp Leu Arg Thr Val His Asp Gly Met Asn His Leu Phe Ala Thr Leu Glu Lys Pro Gly 
b-Lactamase 



PvuI 

TCCGATCGTTGTCAGAAGT AAGTTGGCCGC AGTGTTATC ACTCATGGTTATGGC AGCACTGC AT AATTCTCTTACTGTC A 

I I ■ ' ' I I I I I I 6880 

AGGCTAGC AACAGTCTTC ATTC AACCGGCGTC ACAATAGTGAGTACC AATACCGTCGTGACGT ATTAAGAGAATGAC AGT 

Gly Ile Thr Thr Leu Leu Leu Asn Ala Ala Thr Asn Asp Ser Met Thr Ile Ala Ala Ser Cys Leu Glu Arg Val Thr Met 
b-Lactamase 



TGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGT 

■ ■ ■ ■ I I I ■ ■ 1 ■ 1 I i I I 6960 

ACGGTAGGCATTCTACGAAAAGACACTGACCACTCATGAGTTGGTTCAGTAAGACTCTTATCACATACGCCGCTGGCTCA 

Gly Asp Thr Leu His Lys Glu Thr Val Pro Ser Tyr Glu Val Leu Asp Asn Gln Ser Tyr His Ile Arg Arg Gly Leu 
b-Lactamase 



DraI XmnI 

TGCTCTTGCCCGGCGTCAAC ACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATC ATTGGAAAACGTTC 

i I I I ■ ■ ■ ■ I 1 I I 7040 

ACGAGAACGGGCCGCAGTTGTGCCCTATTATGGCGCGGTGTATCGTCTTGAAATTTTC ACGAGTAGT AACCTTTTGC AAG 

Gln Glu Gln Gly Ala Asp Val Arg Ser Leu Val Ala Gly Cys Leu Leu Val Lys Phe Thr Ser Met Met Pro Phe Arg Glu 
b-Lactamase 



TTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTT 

I I I I I I 1 I 7120 

AAGCCCCGCTTTTGAGAGTTCCTAGAATGGCGACAACTCTAGGTC AAGCT AC ATTGGGTGAGC ACGTGGGTTGACT AGAA 

Glu Pro Arg Phe Ser Glu Leu Ile Lys Gly Ser Asn Leu Asp Leu Glu Ile Tyr Gly Val Arg Ala Gly Leu Gln Asp Glu 
b-Lactamase 



C AGC ATCTTTTACTTTCACC AGCGTTTCTGGGTGAGC AAAAAC AGGAAGGC AAAATGCCGCAAAAAAGGGAATAAGGGCG 

I I I I I I I I 7200 

GTCGTAGAAAATGAAAGTGGTCGCAAAGACCCACTCGTTTTTGTCCTTCCGTTTTACGGCGTTTTTTCCCTTATTCCCGC 

Ala Asp Lys Val Lys Val Leu Thr Glu Pro His Ala Phe Val Pro Leu Cys Phe Ala Ala Phe Phe Pro Ile Leu Ala 
b-Lactamase 



BspHI 

ACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGG 
I ■ ... i ■ ... i , ... i | , , , , i | | | 7280 

TGTGCCTTTACAACTTATGAGTATGAGAAGGAAAAAGTTATAATAACTTCGT AAATAGTCCC AATAACAGAGT ACTCGCC 

Val Arg Phe His Gln Ile Ser Met 
b-Lactamase 



ATAC ATATTTGAATGT ATTTAGAAAAATAAACAAATAGGGGTTCCGCGC AC ATTTCCCCGAAAAGTGCCACCTGACGTCT 

I — ■ . | ■ ■ ■ — | | 1 ■ ■ — I I I 7360 

TATGTATAAACTT AC ATAAATCTTTTTATTTGTTTATCCCCAAGGCGCGTGTAAAGGGGCTTTTCACGGTGGACTGC AGA 
FIG. 12 (cont)






BspHI 



AAGAAACCATTATTATC ATGAC ATTAACCT ATAAAAATAGGCGTATC ACGAGGCCCTTTCGTCTTCAA 

■ ■ ■ ■ i ■ ■ ■ ■ I I I I I I > 7428 

TTCTTTGGTAATAATAGTACTGTAATTGGATATTTTTATCCGC ATAGTGCTCCGGGAAAGC AGAAGTT 




FIG. 13 



FIG. 14
GD2407 pLNBLV-YFP Map.MPD (1 > 7010) Site and Sequence
Enzymes : 36 of 538 enzymes (Filtered) 

Settings : Circular, Certain Sites Only, Standard Genetic Code 

GAATT AATTCATACCAGATC ACCGAAAACTGTCCTCC AAATGTGTCCCCCTCACACTCCC AAATTCGCGGGCTTCTGCCT 

I ■ ■ ■ I I I I 1 1 ■ ■ I ■ ■ | ■ .. . i .... 1 I 80 

CTTAATTAAGTATGGTCTAGTGGCTTTTGACAGGAGGTTTACACAGGGGGAGTGTGAGGGTTTAAGCGCCCGAAGACGGA 

SacII

CTTAGACC AC TC T ACCCTATTCCCC AC ACTC ACCGGAGCC AAAGCCGCGGCCCTTCCGTTTCTTTGCTTTTGAAAGACCC 

I I ■ ■ ■ ■ I 1 I 1 1 ■ ' ' ■■ 1 1 I I I 160 

GAATCTGGTGAGATGGGATAAGGGGTGTGAGTGGCCTCGGTTTCGGCGCCGGGAAGGCAAAGAAACGAAAACTTTCTGGG 



5' LTR 



5' LTR MoMSV



NheI

I 

CACCCGTAGGTGGCAAGCTAGCTTAAGT AACGCC ACTTTGC AAGGCATGGAAAAATAC ATAACTGAGAAT AGAAAAGTTC 

I 1 I I ■ ■ t I ■ ■ ■ ■ I 1 240 

GTGGGCATCC ACCGTTCGATCGAATTC ATTGCGGTGAAACGTTCCGT ACCTTTTT ATGTATTGACTCTT ATCTTTTC AAG 

5' LTR 



5' LTR MoMSV



PvuII EcoRV

AGATC AAGGTCAGGAACAAAGAAACAGCTGAATACCAAAC AGGAT ATCTGTGGTAAGCGGTTCCTGCCCCGGCTC AGGGC 

1 ■ I I ■ ■ ■ ■ I I I I 1 I 320 

TCTAGTTCCAGTCCTTGTTTCTTTGTCGACTTATGGTTTGTCCTATAGACACCATTCGCCAAGGACGGGGCCGAGTCCCG 

5' LTR 



5' LTR MoMSV 

PvuII EcoRV

S i 

CAAGAACAGATGAGACAGCTGAGTGATGGGCCAAAC AGGAT ATCTGTGGTAAGCAGTTCCTGCCCCGGCTCGGGGCC AAG 
I ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I I I I I I 400 

GTTCTTGTCTACTCTGTCGACTCACTACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACGGGGCCGAGCCCCGGTTC

5' LTR



5' LTR MoMSV



AACAGATGGTCCCCAGATGCGGTCCAGCCCTCAGCAGTTTCTAGTGAATCATCAGATGTTTCCAGGGTGCCCCAAGGACC

I I I I I I I I 480 

TTGTCTACCAGGGGTCTACGCCAGGTCGGGAGTCGTCAAAGATCACTTAGTAGTCTACAAAGGTCCCACGGGGTTCCTGG 



5' LTR MoMSV



TGAAAATGACCCTGTACCTTATTTGAACTAACC AATC AGTTCGCTTCTCGCTTCTGTTCGCGCGCTTCCGCTCTCCGAGC 
I ■ ■ 1 1 I i ■ . . . i . ■ . ■ | ■ ■ ■ , i , , , , | , , , ■ | | i 560 

ACTTTTACTGGGAC ATGGAATAAACTTGATTGGTTAGTC AAGCGAAGAGCGAAGACAAGCGCGCGAAGGCGAGAGGCTCG 

5' LTR



5' LTR MoMSV



Thursday, June 13, 2002 3:42 PM FIG. 14 (cont)

SacI AscI SmaI KpnI

I I I 
TCAAT AAAAGAGCCCACAACCCCTCACTCGGCGCGCC AGTCTTCCGAT AGACTGCGTCGCCCGGGT ACCCGTATTCCCAA 
■ ■ ■ ■ > I i ■■■ ■ | ■ ■ ■ ■ | | | ■ ■ ■ ■ | 

AGTTATTTTCTCGGGTGTTGGGGAGTGAGCCGCGCGGTC AGAAGGCTATCTGACGCAGCGGGCCCATGGGC ATAAGGGTT 

5' LTR 



5' LTR MoMSV



TAAAGCCTCTTGCTGTTTGCATCCGAATCGTGGTCTCGCTGTTCCTTGGGAGGGTCTCCTCTGAGTGATTGACT ACCCAC 

■ ■ ■ ■ I I I I I I I ■ I 720 

ATT TCGGAGAACGACAAACGTAGGCTT AGCACCAGAGCGACAAGGAACCCTCCCAGAGGAGAC TCACTAACTGATGGGTG 

5' LTR 

5' LTR MoMSV 



GACGGGGGTCTTTCATTTGGGGGCTCGTCCGGGATTTGGAGACCCCTGCCCAGGGACCACCGACCC ACCACCGGGAGGTA 

I I I I I ■ ■ ■ ■ i ■ ■ ■ ■ I I I 800 

CTGCCCCCAGAAAGTAAACCCCCGAGCAGGCCCTAAACCTCTGGGGACGGGTCCCTGGTGGCTGGGTGGTGGCCCTCCAT 



5' LTR
5' LTR MoMSV



SpeI

i 

AGCTGGCCAGCAACTTATCTGTGTCTGTCCGATTGTCTAGTGTCTATGTTTGATGTTATGCGCCTGCGTCTGTACTAGTT 

■ 1 1 ■ ■ ■ ■ i ■ ■ ■ ■ I I 1 1 I 880 

TCGACCGGTCGTTGAATAGAC ACAGAC AGGCT AAC AGATCACAGATAC AAACTAC AATACGCGGACGC AGACATGATCAA 

Pkg Rgn



Extended Packaging Region

AGCT AACTAGC TCTGTATCTGGCGGACCCGTGGTGGAACTGACGAGTTCTGAAC ACCCGGCCGCAACCCTGGGAGACGTC 

1 I ■ ■ ■ ■ 1 I I I I ■ ■ ■ ■ | 960 

TCGATTGATCGAGACATAGACCGCCTGGGCACCACCTTGACTGCTCAAGACTTGTGGGCCGGCGTTGGGACCCTCTGCAG 



Pkg Rgn



Extended Packaging Region



CCAGGGACTTTGGGGGCCGTTTTTGTGGCCCGACCTGAGGAAGGGAGTCGATGTGGAATCCGACCCCGTCAGGATATGTG 

I I i ■ ■ ■ ■ I I I I 1 1040 

GGTCCCTGAAACCCCCGGCAAAAAC ACCGGGCTGGACTCCTTCCCTC AGCTACACC TTAGGCTGGGGC AGTCCTATAC AC 

Pkg Rgn

Extended Packaging Region 



GTTCTGGTAGGAGACGAGAACCTAAAACAGTTCCCGCCTCCGTCTGAATTTTTGCTTTCGGTTTGGAACCGAAGCCGCGC 

■ ■ ■ i | . ■ , ■ | I 1 I 1 I 1120

CAAGACC ATCCTCTGCTCTTGGATTTTGTCAAGGGCGG AGGCAGACTTAAAAACGAAAGCC AAACCTTGGCTTCGGCGCG 
Pkg Rgn
Extended Packaging Region 



PstI PstI

GTCTTGTCTGCTGC AGCGCTGCAGC ATCGTTCTGTGTTGTCTC TGTCTGACTGTGTTTCTGTATTTGTCTGAAAATTAGG 
1 1 1 ■ ■ ■ ■ i ■ ■ ■ ■ I | ■■■■ ) ■■■■ | | ,,,,,,,,, | 1200

CAGAACAGACGACGTCGCGACGTCGTAGC AAGAC AC AAC AGAGAC AGACTGACACAAAGAC AT AAACAGACTTTTAATCC 

Pkg Rgn

Extended Packaging Region 



FIG. 14 (cont) 
GCCAGACTGTTACCACTCCCTTAAGTTTGACCTTAGGTCACTGGAAAGATGTCGAGCGGATCGCTCACAACCAGTCGGTA

I I I I I ' ■ ■ ■ I ■ ■ ■ ■ I I 1280 

CGGTCTGACAATGGTGAGGGAATTCAAACTGGAATCCAGTGACCTTTCTACAGCTCGCCTAGCGAGTGTTGGTCAGCCAT 

Pkg Rgn



Extended Packaging Region 

PstI

I 

GATGTCAAGAAGAGACGTTGGGTT ACCTTCTGCTCTGC AGAATGGCC AACCTTTAACGTCGGATGGCCGCGAGACGGCAC 

I I I i 1 1 ■ ■ ■ ■ i ■ ■ ■ ■ i ' ' ' ■ I 1360 

CTACAGTTCTTCTCTGCAACCC AATGGAAGACGAGACGTCTT ACCGGTTGGAAATTGCAGCCT ACCGGCGCTCTGCCGTG 



Pkg Rgn



Extended Packaging Region 

CTTTAACCGAGACCTCATCACCCAGGTTAAGATCAAGGTCTTTTCACCTGGCCCGCATGGACACCCAGACCAGGTCCCCT 

■ ■ ■ ■ i ■ ■ ■ ■ 1 I ■ — I I I I 1 ■ 1 — i ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ 1 1440 

GAAATTGGCTCTGGAGTAGTGGGTCC AATTCTAGTTCC AGAAAAGTGGACCGGGCGT ACCTGTGGGTCTGGTCC AGGGGA 



Pkg Rgn



Extended Packaging Region 

ACATCGTGACCTGGGAAGCCTTGGCTTTTGACCCCCCTCCCTGGGTCAAGCCCTTTGTACACCCTAAGCCTCCGCCTCCT 

I I I ■ ■ ■ ■ I I I I I 1520 

TGTAGCACTGGACCCTTCGGAACCGAAAACTGGGGGGAGGGACCC AGTTCGGGAAAC ATGTGGGATTCGGAGGCGGAGGA 



Pkg Rgn



Extended Packaging Region 

CTTCCTCCATCCGCCCCGTCTCTCCCCCTTGAACCTCCTCGTTCGACCCCGCCTCGATCCTCCCTTTATCCAGCCCTCAC 

■ ■ ■ ■ i ■ ■ ■ ■ I I 1 I I I I I 1600 

GAAGGAGGTAGGCGGGGCAGAGAGGGGGAACTTGGAGGAGCAAGCTGGGGCGGAGCTAGGAGGGAAATAGGTCGGGAGTG 



Pkg Rgn



Extended Packaging Region 

NarI EcoRI BclI

I I 

TCCTTCTCTAGGCGCCGGAATTCCGATCTGATCAAGAG ACAGGATGAGGATCGTTTCGC ATGATTGAACAAGATGGATTG 

I I I I 'I 1 I ■ ■ ■ ■ I 1680 

AGGAAGAGATCCGCGGCCTTAAGGCT AGACTAGTTCTC TGTCCTACTCCTAGC AAAGCGTACT AACTTGTTCTACCT AAC 

Pkg Rgn    Met Ile Glu Gln Asp Gly Leu

Extended Packaging Region    Neomycin Phosphotransferase

CACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGC 

I I I I ■ ■ ■ ■ 1 I | ■ ... i , ... | 1760

GTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGAT AAGCCGATACTGACCCGTGTTGTCTGTTAGCCGACGAGACTACG 

His Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala
Neomycin Phosphotransferase 

NarI

CGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGC 
I I I — I I 1 1 1 * i 1 1 1 1 I I I 1840 

GCGGCAC AAGGCCGACAGTCGCGTCCCCGCGGGCCAAGAAAAAC AGTTCTGGCTGGAC AGGCC ACGGGACTTACTTGACG 



Ala Val Phe Arg Leu Ser Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala Leu Asn Glu Leu
Neomycin Phosphotransferase 



FIG. 14(cont) 
PstI Fspl PvuII

i i 

AGGACGAGGCAGCGCGGCT ATCGTGGCTGGCC ACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTC ACTGAAGCG 

■ ■ ■ ■ I I I I I 1 ■ 1 1 I I ■ 1 1 ■ I 1920 

TCCTGC TCCGTCGCGCCGATAGCACCGACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGC AAC AGTGACTTCGC 

Gln Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val Thr Glu Ala
Neomycin Phosphotransferase 



GGAAGGGACTGGCTGCT ATTGGGCGAAGTGCCGGGGC AGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGT ATC 

I I I I I I ■ ' ' ' i ■ ■ ■ ■ I I 2000 

CCTTCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGACAGTAGAGTGGAACGAGGACGGCTCTTTCATAG

Gly Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser Ser His Leu Ala Pro Ala Glu Lys Val Ser
Neomycin Phosphotransferase 

C ATC ATGGCTGATGCAATGCGGCGGCTGC AT ACGCTTGATCCGGCT ACCTGCCC ATTCGACCACCAAGCGAAACATCGC A 

I I I ■ ■ ■ ■ I I I ■ ■ ■ ■ | ■ ... i .... | 2080 

GTAGTACCGACTACGTT ACGCCGCCGACGTATGCGAACT AGGCCGATGGACGGGTAAGCTGGTGGTTCGCTTTGTAGCGT 

Ile Met Ala Asp Ala Met Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln Ala Lys His Arg
Neomycin Phosphotransferase 



TCGAGCGAGC ACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGC ATC AGGGGCTCGCGCC A 
I I I — ■ 1 — H- — ■ — 1 I 1 2160 

AGCTCGCTCGTGCATGAGCCTACCTTCGGCCAGAACAGCTAGTCCTACTAGACCTGCTTCTCGTAGTCCCCGAGCGCGGT 

Ile Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly Leu Ala Pro
Neomycin Phosphotransferase 



SphI NcoI

I I 

GCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCC 

I I | ... ■ | i | \ | 2240 

CGGC TTGACAAGCGGTCCGAGTTCCGCGCGT ACGGGCTGCCGCTCCTAGAGC AGC ACTGGGT ACCGCTACGGACGAACGG 

Ala Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp Gly Glu Asp Leu Val Val Thr His Gly Asp Ala Cys Leu Pro 
Neomycin Phosphotransferase 

NaeI

GAAT ATCATGGTGGAAAATGGCCGCTTTTCTGGATTC ATCGACTGTGGCCGGCTGGGTGTGGCGGACCGC TATCAGGAC A 

■ ■ ■ ■ I I ■ ... i ■ ■ ■ ■ l ■ ■ ■ ■ I I I I I 2320 

CTTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGTAGCTGACACCGGCCGACCCACACCGCCTGGCGATAGTCCTGT 

Asn Ile Met Val Glu Asn Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp Arg Tyr Gln Asp
Neomycin Phosphotransferase 

TAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCC 

I I ■ ■ ■ ■ 1 I I I I i ■ ■ ■ ■ 1 2400 

ATCGCAACCGATGGGCACT ATAACGACTTCTCGAACCGCCGCTT ACCCGACTGGCGAAGGAGCACGAAATGCC AT AGCGG 

Ile Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala
Neomycin Phosphotransferase 



GC TCCCGATTCGCAGCGC ATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACC 

I I I I ■ ■ ■ — 1 1 ■ ■ ■ — i- — — - H I 2480 

CG AGGGCTAAGCGTCGCGTAGCGGAAGAT AGCGGAAGAACTGCTCAAGAAGACTCGCCCTGAGACCCCAAGCTTTACTGG 



Ala Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe •
Neomycin Phosphotransferase 
GACCAAGCGACGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGT

I I ■ ' ■ ■ I ■ ■ ■ ' | ■ ... i ■ ... | | | . ■ . ■ i 2560 

CTGGTTCGCTGCGGGTTGGACGGTAGTGCTCTAAAGCTAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCA 



NaeI SmaI

! ! 

TTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCGGGCTCGATCCCC 

I I \ I I ■ ■ ■ ■ i ■ ■ ■ ■ I I I 2640 

AAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCT AGAGTACGACCTCAAGAAGCGGGTGGGGCCCGAGCT AGGGG 



NruI PvuII

TCGCGAGTTGGTTCAGCTGCTGCCTGAGGCTGGACGACCTCGCGGAGTTCTACCGGCAGTGCAAATCCGTCGGCATCCAG 

I I I I I ■ ■ ■ ■ I I I 2720 

AGCGCTCAACCAAGTCGACGACGGACTCCGACC TGCTGGAGCGCCTC AAGATGGCCGTCACGTTTAGGC AGCCGT AGGTC 



PstI

GAAACC AGCAGCGGCTATCCGCGC ATCC ATGCCCCCGAACTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGC 

I I I I I I I ■ ■ ■ ■ I 2800 

CTTTGGTCGTCGCCGATAGGCGCGTAGGTACGGGGGCTTGACGTCCTCACCCCTCCGTGCTACCGGCGAAACCAGCTCCG 



BamHI

GGATCCTAGC AGAAAAATAAGACTTGATTCCCCCTTAAAATT ACAACTGCT AGAAAATGAATGGCTCTCCCGCCTTTTTT 

I I | .... i .... | | | | | 2880 

CCTAGGATCGTCTTTTTATTCTGAACT AAGGGGGAATTTTAATGTTGACGATCTTTTACTTACCGAGAGGGCGGAAAAAA 

BLV Pro 

BLV Promoter 



NarI PvuII

GAGGGGGAATCATTTGTATGAAAGATC ATGCCGACCTAGGCGCCGCC ACCGCCCCGTAAACC AGACAGAGACGTC AGCTG 
1 | .... i .... | | i i . , | | 2960 

CTCCCCCTTAGTAAACATACTTTCTAGTACGGCTGGATCCGCGGCGGTGGCGGGGCATTTGGTCTGTCTCTGCAGTCGAC 

BLV Pro 



BLV Promoter



PvuII

I 

CCAGAAAAGCTGGTGACGGC AGCTGGTGGCT AGAATCCCCGT ACCTCCCC AACTTCCCCTTTCCCGAAAAATCC AC ACCC 

■ ■ ■ ■ i ■ ■ ■ ■ | | ■ ... i ■ ... | , ... i ■■■■ | 1 1 1 I 3040 

GGTCTTTTCGACCACTGCCGTCGACCACCGATCTTAGGGGCATGGAGGGGTTGAAGGGGAAAGGGCTTTTTAGGTGTGGG 

BLV Pro 



BLV Promoter



NaeI

I 

TGAGCTGCTGACCTCACCTGCTGATAAATTAATAAAATGCCGGCCCTGTCGAGTTAGCGGCACCAGAAGCGTTCTTCTCC 
I ■ ... | .... i ■ ... | i . ■ ■ ■ | | i ■ ■ ■ ■ i 3120 

ACTCGACGACTGGAGTGGACGACTATTT AATTATTTTACGGCCGGGAC AGCTC AATCGCCGTGGTCTTCGC AAGAAGAGG 

BLV Pro 



BLV Promoter 
XhoI HindIII

TGAGACCCTCGTGCTCAGCTCTCGGTCCTGCCTCGAGAAGCTTGTT ATC AC AAGTTTGTACAAAAAAGC AGGCTTCGAAG 
I i , ... t .... | .... t .... | ■ , , ■ i ■ ■ . ■ | | I 3200 

ACTCTGGGAGCACGAGTCGAGAGCCAGGACGGAGCTCTTCGAACAATAGTGTTCAAACATGTTTTTTCGTCCGAAGCTTC 



BLV Pro 



BLV Promoter    attB1



NcoI

i 

! 

GAGATAGAACCAATTCTCT AAGGAAATACTTAACC ATGGTGAGCAAGGGCGAGGAGCTGTTC ACCGGGGTGGTGCCC ATC 

■ ■ ■ ■ 1 I ■ ' ' ■ I I ■ ' ' ' i I I I 3280 

CTCTATCTTGGTTAAGAGATTCCTTTATGAATTGGTACCACTCGTTCCCGCTCCTCGACAAGTGGCCCCACCACGGGTAG 

Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile
EYFP



CTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAA 

I ■ ■ ■ ■ 1 | .... i .... | | 1 ■ ■ ■ I 1 ■ ■ ■ ■ i ■ ■ ■ ■ I 3360 

GACC AGCTCGACCTGCCGC TGCATTTGCCGGTGTTCAAGTCGCACAGGCCGCTCCCGCTCCCGCT ACGGTGGATGCCGTT 

Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys
EYFP 



GCTGACCCTGAAGTTC ATCTGC ACC ACCGGC AAGCTGCCCGTGCCCTGGCCC ACCCTCGTGACC ACCTTCGGCT ACGGCC 
I .... i .... I ■ | I I I I I 3440 

CGACTGGGAC TTCAAGTAGACGTGGTGGCCGTTCGACGGGCACGGGACCGGGTGGGAGCACTGGTGGAAGCCGATGCCGG 

Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe Gly Tyr Gly
EYFP 



PstI

TGCAGTGCTTCGCCCGCTACCCCGACC AC ATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCT ACGTCC AG 

1 ■ ■ ■ ■ I I I ■ ■ — i ■ ■ ■ ■ i 1 | .... i .... | 3520 

ACGTCACGAAGCGGGCGATGGGGCTGGTGT ACTTCGTCGTGCTGAAGAAGTTC AGGCGGTACGGGCTTCCGATGC AGGTC 

Leu Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln
EYFP 



GAGCGC ACCATCTTCTTC AAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAA 

1 I I ■ ■ — h — ■ ■ ■ I 1 i .... i ■ ... | | 3600 

CTCGCGTGGT AGAAGAAGTTCCTGCTGCCGTTGATGTTCTGGGCGCGGCTCCACTTCAAGCTCCCGCTGTGGGACC ACTT 

Glu Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn
EYFP 



CCGC ATCGAGCTGAAGGGC ATCGAC TTC AAGGAGGACGGCAAC ATCCTGGGGCACAAGCTGGAGTACAACTAC AACAGCC 

■ 1 ■ ■ I I I 1 I I I I 3680 

GGCGTAGCTCGACTTCCCGT AGCTGAAGTTCCTCCTGCCGTTGTAGGACCCCGTGTTCGACCTC ATGTTGATGTTGTCGG 

Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser



ACAACGTCTATATC ATGGCCGAC AAGC AGAAGAACGGC ATC AAGGTGAACTTC AAGATCCGCC ACAAC ATCGAGGACGGC 

■ ■ ■ ■ i ■ • ■ ■ I I | .... i .... | | ■ ■ . ■ | ■ ■ I ■ ■ I 3760 

TGTTGC AGATAT AGTACCGGCTGTTCGTCTTCTTGCCGTAGTTCCACTTGAAGTTCTAGGCGGTGTTGTAGCTCCTGCCG 



His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly
EYFP 
AGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCT
I I | .... t .... | ■ ■ ■ ■ | , ... | .... i ■ ... | ... ■ | 3840 

TCGCACGTCGAGCGGCTGGTGATGGTCGTCTTGTGGGGGTAGCCGCTGCCGGGGCACGACGACGGGCTGTTGGTGATGGA 

Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu
EYFP 



GAGCTACCAGTCCGCCCTGAGC AAAGACCCCAACGAGAAGCGCGATCAC ATGGTCCTGCTGGAGTTCGTGACCGCCGCCG 
I , ... i ■ ■■■ | | | | | | ■ ■ ■ ■ | 3920 

CTCGATGGTCAGGCGGGACTCGTTTCTGGGGTTGCTCTTCGCGCTAGTGT ACC AGGACGACCTCAAGC ACTGGCGGCGGC 

Ser Tyr Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val Thr Ala Ala
EYFP 



NotI XhoI

EcoRV
XbaI



GGATCACTCTCGGCATGGACGAGCTGTACAAGTAAAGCGGCCGCACTCGAGATATCTAGACCC AGCTTTCTTGTACAAAG 

■ ■ ■ ■ I ' ■ ■ ■ I I ■ ■ ■ ■ I i .... i .... | | | 4000

CCTAGTGAGAGCCGTACCTGCTCGAC ATGTTC ATTTCGCCGGCGTGAGCTCTATAGATCTGGGTCGAAAGAAC ATGTTTC 



Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys
EYFP 



attB2 



attB2



ClaI



TGGTGAT AACATCGATAAAATAAAAGATTTTATTTAGTCTCCAGAAAAAGGGGGGAATGAAAGACCCCACCTGTAGGTTT 

I I I I I I I I 4080 

ACCACTATTGTAGCTATTTTATTTTCTAAAAT AAATCAGAGGTCTTTTTCCCCCCTT ACTTTC TGGGGTGGAC ATCC AAA 

attB2



3' LTR 



3' LTR MoMLV



NheI



GGC AAGCTAGCTT AAGTAACGCCATTTTGC AAGGCATGGAAAAATAC AT AACTGAGAATAG AGAAGTTC AGATCAAGGTC 

I I I I ■ 1 1 1 I | ■ ... i .... | ■ ... i ■ ... | 4160

CCGTTCGATCGAATTCATTGCGGTAAAACGTTCCGTACCTTTTTATGTATTGACTCTTATCTCTTCAAGTCTAGTTCCAG 

3' LTR 



3' LTR MoMLV



PvuII EcoRV

AGGAACAGATGGAAC AGCTGAATATGGGCC AAAC AGGATATCTGTGGT AAGCAGTTCCTGCCCCGGCTC AGGGCC AAGAA 

I I I I I — — ■ ' | — H- 4240 

TCCTTGTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACGGGGCCGAGTCCCGGTTCTT 

3' LTR 



3' LTR MoMLV



PvuII EcoRV

CAGATGGAAC AGCTGAATATGGGCC AAAC AGGATATCTGTGGTAAGCAGTTCCTGCCCCGGCTC AGGGCC AAGAACAGAT 
I ■ ■ ■ ■ I I I I I | .... i | 4320 

GTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACGGGGCCGAGTCCCGGTTCTTGTCTA 

__ 



3' LTR MoMLV



FIG. 14 (cont) 
XbaI

GGTCCCC AGATGCGGTCCAGCCCTC AGCAGTTTCTAGAGAACCATC AGATGTTTCCAGGGTGCCCC AAGGACCTGAAATG 

— — — ■ ■ | — — H I ■ 1 — I I I I' — I 4400 

CCAGGGGTCTACGCCAGGTCGGGAGTCGTC AAAGATCTCTTGGT AGTCTACAAAGGTCCC ACGGGGTTCCTGGACTTT AC 

3' LTR 

3' LTR MoMLV 



SacI

ACCCTGTGCCTTATTTGAACTAACC AATC AGTTCGCTTCTCGCTTCTGTTCGCGCGCTTCTGCTCCCCGAGCTC AATAAA 

I 1 I I I I I I 4480 

TGGGAC ACGGAATAAACTTGATTGGTT AGTC AAGCGAAGAGCGAAGAC AAGCGCGCGAAGACGAGGGGCTCGAGTTATTT 

3' LTR 



3' LTR MoMLV



NarI SmaI KpnI

AGAGCCC AC AACCCCTC ACTCGGGGCGCC AGTCCTCCGATTGACTGAGTCGCCCGGGT ACCCGTGTATCC AATAAACCCT 

I I I I I I ■ ■ ■ ■ i ■ ■ ■ ■ 1 1 4560 

TCTCGGGTGTTGGGGAGTGAGCCCCGCGGTCAGGAGGCTAACTGACTCAGCGGGCCCATGGGCACATAGGTTATTTGGGA 

3' LTR 



3' LTR MoMLV



CTTGCAGTTGCATCCGACTTGTGGTCTCGCTGTTCCTTGGGAGGGTCTCCTCTGAGTGATTGACTACCCGTCAGCGGGGG 
I I 1 I I 1 I I 4640 

GAACGTC AACGTAGGCTGAAC ACCAGAGCGAC AAGGAACCCTCCC AGAGGAGACTC ACTAACTGATGGGC AGTCGCCCCC 

3' LTR 

3' LTR MoMLV 



TCTTTCATTTGGGGGCTCGTCCGGGATCGGGAGACCCCTGCCCAGGGACCACCGACCCACCACCGGGAGGTAAGCTGGCT 

I 1 I I 1 I I I 4720 

AGAAAGTAAACCCCCGAGCAGGCCCTAGCCCTCTGGGGACGGGTCCCTGGTGGCTGGGTGGTGGCCCTCCATTCGACCGA 



3' LTR
3' LTR MoMLV



GCCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTC AC AGCTTGTCTGTAAGCG 

■ ' ■ ■ i ■ ■ ■ ■ 1 ■ ■ ■ ■ I 1 I ■ ■ ■ ■ I I I ■ ■ ■ ■ i ■ ■ ■ ■ I 4800 

CGGAGCGCGC AAAGCCACTACTGCC ACTTTTGGAGACTGTGTACGTCGAGGGCCTCTGCC AGTGTCGAAC AGAC ATTCGC 



GATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCGCAGCCATGACCCAGTCACG 

■ ■ ■ ■ i ■ ■ ■ ' I ■ ■ ■ ■ I I I ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I 1 I 4880 

CTACGGCCCTCGTCTGTTCGGGCAGTCCCGCGCAGTCGCCCACAACCGCCCACAGCCCCGCGTCGGTACTGGGTCAGTGC 



NdeI

! 

TAGCGATAGCGGAGTGTATACTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATGCGGTGTGA 

■ ■ ■ ■ i ■ ■ ■ ■ I I I I I | .... i .... | | 4960 

ATCGCTATCGCCTCACATATGACCGAATTGATACGCCGTAGTCTCGTCTAACATGACTCTCACGTGGTATACGCCACACT 



AATACCGC AC AGATGCGT AAGGAGAAAATACCGC ATCAGGCGCTCTTCCGCTTCCTCGCTC ACTGACTCGCTGCGCTCGG 
■ ... 1 ... ■ i | 1 ■ ■ ■ ■ I 1 I ■ ■ ■ ■ I 5040 

TTATGGCGTGTCTACGCATTCCTCTTTT ATGGCGTAGTCCGCGAGAAGGCGAAGGAGCGAGTGACTGAGCGACGCGAGCC 
TCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGA

I ■ ... i .... I I | | — 'I ■ I ■ ■ ■ ■ I 5120 

AGCAAGCCGACGCCGCTCGCCATAGTCGAGTGAGTTTCCGCCATTATGCCAATAGGTGTCTTAGTCCCCTATTGCGTCCT 

AAGAAC ATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCC AT AGGCTCCG 

m | | 1 1 ■ 1 — I 1 1 — ■ i ■ ■ ■ ■ | ■ ■ ■ | I 5200 

TTCTTGTACACTCGTTTTCCGGTCGTTTTCCGGTCCTTGGCATTTTTCCGGCGCAACGACCGCAAAAAGGTATCCGAGGC 



CCCCCCTGACGAGC ATC AC AAAAATCGACGCTCAAGTC AGAGGTGGCGAAACCCGAC AGG ACTAT AAAGATACC AGGCGT 

1 I I I 1 I I I 5280 

GGGGGGACTGCTCGTAGTGTTTTTAGCTGCGAGTTCAGTCTCCACCGCTTTGGGCTGTCCTGATATTTCTATGGTCCGCA 

TTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCG 

■ ■■■ i ■■■■ I I I ■ ... | .... i ■ ... | i | | 5360

AAGGGGGACCTTCGAGGGAGC ACGCGAGAGGACAAGGCTGGGACGGCGAATGGCCTATGGACAGGCGGAAAGAGGGAAGC 

GGAAGCGTGGCGCTTTCTC ATAGCTC ACGC TGT AGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGT 

■ ■ ■ ■ i ■ ■ ■ ■ 1 I I 1 I I 1 ' ■ ' ■ ■ ' ■ ' ■ 1 5440 

CCTTCGC ACCGCGAAAGAGT ATCGAGTGCG AC ATCCAT AGAGTCAAGCCACATCCAGC AAGCGAGGTTCGACCCGACACA 

GCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACT 

I I 1 I | .... i .... | | — ~-H- 5520 

CGTGCTTGGGGGGCAAGTCGGGCTGGCGACGCGGAATAGGCCATTGATAGCAGAACTCAGGTTGGGCCATTCTGTGCTGA 

TATCGCC AC TGGC AGCAGCC ACTGGTAACAGGATTAGCAGAGCG AGGTATGT AGGCGGTGCT ACAGAGTTCTTGAAGTGG 

I ■ ■ 1 — ■ — H — ■ H — — +h — ~+h — I I I I 5600 

ATAGCGGTGACCGTCGTCGGTGACCATTGTCCTAATCGTCTCGCTCCATACATCCGCCACGATGTCTCAAGAACTTCACC 

TGGCCTAACTACGGC TACACT AGAAGGACAGT ATTTGGTATCTGCGCTCTGCTGAAGCC AGTTACCTTCGGAAAAAGAGT 
1 I | , ... i .... | | i I I 5680 

ACCGGATTGATGCCGATGTGATCTTCCTGTC AT AAACC ATAGACGCGAGACGACTTCGGTCAATGGAAGCCTTTTTCTCA 

TGGT AGC TCTTGATCCGGC AAAC AAACC ACCGCTGGT AGCGGTGGTTTTTTTGTTTGC AAGC AGC AGATTACGCGC AGAA 
■ ... I ■ ... i ■ ... 1 | | | | ■ ■ , . i | 5760

ACCATCGAG AACT AGGCCGTTTGTTTGGTGGCGACC ATCGCCACC AAAAAAACAAACGTTCGTCGTCTAATGCGCGTCTT 

AAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTC ACGTTAAGGGATT 

I ■ ■ ■ ■ I ■ ■ ' ' 1 I I I I ' ■ ' ■ I 5840 

TTTTTCCTAGAGTTCTTCTAGGAAAC TAGAAAAGATGCCCCAGACTGCGAGTCACCTTGCTTTTGAGTGC AATTCCCTAA 

BspHI DraI DraI

| I 

TTGGTC ATGAGATTATCAAAAAGGATCTTC ACCTAG ATCCTTTT AAATTAAAAATGAAGTTTT AAATCAATCTAAAGTAT 

I I ■ ■ ' ■ I ■ ■ ■ ■ I 1 I I 1 ■ ■ ■ I 5920 

AACC AGTACTCTAAT AGTTTTTCCTAGAAGTGGATCT AGGAAAATTTAATTTTTACTTC AAAATTTAGTTAGATTTCATA 

ATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCAT 

I 'I — -I I I ■ ■ ■ I I — H- 6000 

TATACTCATTTGAACCAGACTGTC AATGGTTACGAATTAGTCACTCCGTGGATAGAGTCGCTAGACAGATAAAGCAAGTA 



• Trp His Lys Ile Leu Ser Ala Gly Ile Glu Ala Ile Gln Arg Asn Arg Glu Asp
b-Lactamase



FIG. 14 (cont)

CCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGAT ACGGGAGGGCTTACC ATCTGGCCCC AGTGCTGC AATGAT A 

■ ■ ■ ■ i ■ ■ ■ ■ I I I I | .... i .... | | , — I 6080 

GGTATC AACGGACTGAGGGGCAGC AC ATCTATTGATGC TATGCCCTCCCGAATGGTAGACCGGGGTC ACGACGTTACTAT 

Met Thr Ala Gln Ser Gly Thr Thr Tyr Ile Val Val Ile Arg Ser Pro Lys Gly Asp Pro Gly Leu Ala Ala Ile Ile
b-Lactamase



CCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGC AATAAACC AGCCAGCCGGAAGGGCCGAGCGC AGAAGTGGTCC 

\ I I ■ ■ ■ ■ I I I I I 6160 

GGCGCTCTGGGTGCGAGTGGCCGAGGTCTAAATAGTCGTTATTTGGTCGGTCGGCCTTCCCGGCTCGCGTCTTCACCAGG 

Gly Arg Ser Gly Arg Glu Gly Ala Gly Ser Lys Asp Ala Ile Phe Trp Gly Ala Pro Leu Ala Ser Arg Leu Leu Pro Gly 
b-Lactamase



FspI

TGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGC 

1 1 1 ■ i ■ ■ ■ * I I I I I I I I 6240 

ACGTTGAAATAGGCGGAGGT AGGTC AGATAATT AACAACGGCCCTTCGATCTC ATTCATCAAGCGGTCAATTATCAAACG 

Ala Val Lys Asp Ala Glu Met Trp Asp Ile Leu Gln Gln Arg Ser Ala Leu Thr Leu Leu Glu Gly Thr Leu Leu Lys Arg
b-Lactamase



PstI

GCAACGTTGTTGCCATTGCTGCAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAA 

■ ■ ■ ■ i ■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ 1 | | i | i | 6320 

CGTTGC AAC AACGGT AACGACGTCCGTAGCACC AC AGTGCGAGCAGC AAACC ATACCGAAGTAAGTCGAGGCC AAGGGTT 

Leu Thr Thr Ala Met Ala Ala Pro Met Thr Thr Asp Arg Glu Asp Asn Pro Ile Ala Glu Asn Leu Glu Pro Glu Trp 
b-Lactamase



PvuI

I 

CGATCAAGGCGAGTT AC ATGATCCCCCATGTTGTGC AAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAG 

I I I I I ■ ' ■ ■ I I I 6400 

GCTAGTTCCGCTC AATGTACTAGGGGGT ACAAC ACGTTTTTTCGCC AATCGAGGAAGCC AGGAGGCTAGCAAC AGTCTTC 

Arg Asp Leu Arg Thr Val His Asp Gly Met Asn His Leu Phe Ala Thr Leu Glu Lys Pro Gly Gly Ile Thr Thr Leu Leu
b-Lactamase



TAAGTTGGCCGC AGTGTTATCACTCATGGTTATGGC AGC ACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCT 

■ ' ' ' ■ I 1 I I ' ■ ■ ' I I I I 6480 

ATTCAACCGGCGTCACAATAGTGAGTACCAATACCGTCGTGACGTATTAAGAGAATGACAGTACGGTAGGCATTCTACGA 

Leu Asn Ala Ala Thr Asn Asp Ser Met Thr Ile Ala Ala Ser Cys Leu Glu Arg Val Thr Met Gly Asp Thr Leu His Lys 
b-Lactamase



TTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCA 

I I I I ■ ■ ■ ■ I I ■ ■ ■ ■ I 1 6560 

AAAGAC ACTGACC ACTC ATGAGTTGGTTC AGTAAGACTCTT ATCAC ATACGCCGCTGGCTCAACGAGAACGGGCCGCAGT 

Glu Thr Val Pro Ser Tyr Glu Val Leu Asp Asn Gln Ser Tyr His Ile Arg Arg Gly Leu Gln Glu Gln Gly Ala Asp
b-Lactamase

DraI

i 

ACACGGGAT AATACCGCGCC AC AT AGCAGAACTTTAAAAGTGCTC ATCATTGGAAAACGTTCTTCGGGGCGAAAAC TC TC 

I 1 I 1 1 1 1 I I 1 I ■ ■ ■ ■ I 6640 

TGTGCCCTATTATGGCGCGGTGTATCGTCTTGAAATTTTC ACGAGTAGTAACCTTTTGCAAGAAGCCCCGCTTTTG AGAG 

Val Arg Ser Leu Val Ala Gly Cys Leu Leu Val Lys Phe Thr Ser Met Met Pro Phe Arg Glu Glu Pro Arg Phe Ser Glu
b-Lactamase



FIG. 14 (cont)

AAGGATCTTACCGCTGTTGAGATCC AGTTCGATGTAACCC ACTCGTGCACCC AACTGATCTTC AGCATCTTTT ACTTTCA 

■ ■ ■ ■ I I | i | | l I 6720 

TTCCTAGAATGGCGACAACTCTAGGTCAAGC TAC ATTGGGTGAGCACGTGGGTTGACTAGAAGTCGTAGAAAATGAAAGT 

Leu Ile Lys Gly Ser Asn Leu Asp Leu Glu Ile Tyr Gly Val Arg Ala Gly Leu Gln Asp Glu Ala Asp Lys Val Lys Val
b-Lactamase



CC AGCGTTTCTGGGTGAGCAAAAAC AGGAAGGCAAAATGCCGCAAAAAAGGGAAT AAGGGCGACACGGAAATGTTGAAT A 

I ■ ■ ■ ■ I ■ ■ ■ ■ I I I I ■ ■ ■ ■ i ■ ■ ■ ■ I ■ ■ ■ — I 6800 

GGTCGCAAAGACCC ACTCGTTTTTGTCCTTCCGTTTTACGGCGTTTTTTCCCTTATTCCCGCTGTGCCTTTAC AACTTAT 

Leu Thr Glu Pro His Ala Phe Val Pro Leu Cys Phe Ala Ala Phe Phe Pro Ile Leu Ala Val Arg Phe His Gln Ile
b-Lactamase



BspHI 

CTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTAT 

I ■ ... i .... I I ■ ■ ■ ■ I I I ■ ■ ■ ■ I I 6880 

GAGTATGAGAAGGAAAAAGTTAT AATAACTTCGT AAATAGTCCC AATAAC AG AGTACTCGCCTATGTATAAAC TT AC ATA 

Ser Met
b-Lactamase



BspHI 

I 

TTAGAAAAATAAACAAAT AGGGGTTCCGCGC ACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACC ATT ATTATCA 

I — ■ 1 I ■ ■ ■ — i ■ ■ I I ■ ■ ■ ■ I i .... i ■ ... | 6960 

AATCTTTTT ATTTGTTT ATCCCC AAGGCGCGTGT AAAGGGGCTTTTC ACGGTGGACTGCAGATTCTTTGGTAATAATAGT 



TGACATTAACCTATAAAAAT AGGCGTATC ACGAGGCCCTTTCGTCTTCAA 

" ■ I I I I > 7010 

AC TGTAATTGGATATTTTTATCCGC ATAGTGCTCCGGG AAAGC AGAAGTT 




FIG. 15 



FIG. 16 
pLNHiv-YFP Map.MPD (1 > 7121) Site and Sequence
Enzymes : 36 of 538 enzymes (Filtered) 

Settings : Circular, Certain Sites Only, Standard Genetic Code 

GAATTAATTCATACC AGATC ACCGAAAACTGTCCTCC AAATGTGTCCCCCTCAC ACTCCC AAATTCGCGGGCTTCTGCCT 

I I I I | .... i .... | ... ■ | | | 80 

CTT AATTAAGTATGGTCTAGTGGCTTTTGAC AGGAGGTTT ACACAGGGGGAGTGTGAGGGTTTAAGCGCCCGAAGACGGA 

SacII

CTTAGACCACTCTACCCT ATTCCCCACACTC ACCGGAGCCAAAGCCGCGGCCCTTCCGTTTCTTTGCTTTTGAAAGACCC 

I I ■ 1 I ■ ■ 1 ' I I I I I 160 

GAATCTGGTGAGATGGGATAAGGGGTGTGAGTGGCCTCGGTTTCGGCGCCGGGAAGGCAAAGAAACGAAAACTTTCTGGG 



5' LTR



NheI



CACCCGTAGGTGGCAAGCTAGCTTAAGTAACGCCACTTTGCAAGGCATGGAAAAATACATAACTGAGAATAGAAAAGTTC 

■ I 1 — — h — ■ — H~ — h + h — 1 ■ I I — 1 I I I 240 

GTGGGCATCCACCGTTCGATCGAATTCATTGCGGTGAAACGTTCCGTACCTTTTTATGTATTGACTCTTATCTTTTCAAG 



5' LTR



PvuII EcoRV

i I 

AGATCAAGGTCAGGAACAAAGAAACAGCTGAAT ACC AAACAGGATATCTGTGGT AAGCGGTTCCTGCCCCGGCTCAGGGC 

I I I I I ... | .... i .... | | 320 

TCTAGTTCC AGTCCTTGTTTCTTTGTCGACTTATGGTTTGTCCT ATAGAC ACC ATTCGCC AAGGACGGGGCCGAGTCCCG 



5' LTR



PvuII EcoRV

i I 

CAAGAACAGATGAGACAGCTGAGTGATGGGCC AAACAGGATATCTGTGGTAAGC AGTTCCTGCCCCGGCTCGGGGCCAAG 

■ ■ ■ ■ i ' ■ ' ■ I ' ■ ■ ' I I ■ ■ ■ ' 1 I I I I 400 

GTTCTTGTCTACTCTGTCGACTCACTACCCGGTTTGTCCTATAGACACCATTCGTCAAGGACGGGGCCGAGCCCCGGTTC 



5' LTR



AACAGATGGTCCCCAGATGCGGTCCAGCCCTC AGC AGTTTCTAGTGAATC ATCAGATGTTTCC AGGGTGCCCCAAGGACC 

I I I I I I I I 480 

TTGTCTACCAGGGGTCTACGCCAGGTCGGGAGTCGTCAAAGATCACTTAGTAGTCTACAAAGGTCCC ACGGGGTTCCTGG 



5' LTR



TGAAAATGACCCTGTACCTTATTTGAACTAACCAATCAGTTCGCTTCTCGCTTCTGTTCGCGCGCTTCCGC TC TCCGAGC 

I I I ■ ■ ■ ■ I I ■ ■ ■ ■ I I ■ ■ ■ ■ I 560 

ACTTTTACTGGGACATGGAAT AAACTTGATTGGTTAGTC AAGCGAAGAGCGAAGACAAGCGCGCGAAGGCGAGAGGCTCG 



5' LTR 



FIG. 16 (cont) 

SacI AscI SmaI KpnI

TCAATAAAAGAGCCC AC AACCCCTCACTCGGCGCGCC AGTCTTCCGATAGACTGCGTCGCCCGGGTACCCGT ATTCCC AA 

■ ■ ■ ■ I I ■ ■ ■ ■ I | . . ■ ■ | i | | 

AGTTATTTTCTCGGGTGTTGGGGAGTGAGCCGCGCGGTCAGAAGGCTATCTGACGCAGCGGGCCCATGGGC ATAAGGGTT 



5' LTR



TAAAGCCTC TTGC TGTTTGC ATCCGAATCGTGGTCTCGCTGTTCCTTGGGAGGGTCTCCTCTGAGTGATTGACTACCCAC 
I I | ■ ... i ■ ■■■ | | | i | 720 

ATTTCGGAGAACGACAAACGT AGGCTTAGC ACCAGAGCGACAAGGAACCCTCCCAGAGGAGACTC ACTAACTGATGGGTG 



5' LTR



GACGGGGGTCTTTCATTTGGGGGCTCGTCCGGGATTTGGAGACCCCTGCCCAGGGACCACCGACCCACCACCGGGAGGTA 

\ I I I I I I i 800 

CTGCCCCCAGAAAGTAAACCCCCGAGCAGGCCCTAAACCTCTGGGGACGGGTCCCTGGTGGCTGGGTGGTGGCCCTCCAT



5' LTR



SpeI

AGCTGGCCAGCAACTTATCTGTGTCTGTCCGATTGTCTAGTGTCTATGTTTGATGTTATGCGCCTGCGTCTGTACTAGTT 
I 1 1 ' 1 I ' ■ ' ' I I — 1 — I I ■ ■ ■ — I I 880 

TCGACCGGTCGTTGAAT AGACACAGACAGGCTAACAGATCACAGATACAAACT ACAATACGCGGACGC AGACATGATC AA 



Extended Packaging Region



AGCTAACTAGCTCTGT ATCTGGCGGACCCGTGGTGGAACTGACGAGTTCTGAACACCCGGCCGCAACCCTGGGAGACGTC 
I ■■■■ | | — ~ — | | | | | 960 

TCGATTGATCGAGACAT AGACCGCCTGGGC ACCACCTTGACTGCTCAAGACTTGTGGGCCGGCGTTGGGACCCTCTGC AG 



Extended Packaging Region



CCAGGGACTTTGGGGGCCGTTTTTGTGGCCCGACCTGAGGAAGGGAGTCGATGTGGAATCCGACCCCGTCAGGATATGTG 
I ■ ! ■ ■ ■ ■ I I j | | 1040 

GGTCCCTGAAACCCCCGGCAAAAACACCGGGCTGGAC TCCTTCCCTC AGCTAC ACCTTAGGCTGGGGC AGTCCTATAC AC 



Extended Packaging Region



GTTCTGGTAGGAGACGAGAACCTAAAACAGTTCCCGCCTCCGTCTGAATTTTTGCTTTCGGTTTGGAACCGAAGCCGCGC 

1 1 11 1 1 1 11 I i I ' 1 • • 1 1 1 1 1 I I I I I 1120

CAAGACCATCCTCTGCTCTTGGATTTTGTC AAGGGCGG AGGCAGACTTAAAAACGAAAGCC AAACCTTGGCTTCGGCGCG 



Extended Packaging Region



PstI PstI



GTCTTGTCTGCTGC AGCGCTGC AGCATCGTTC TGTGTTGTCTCTGTCTGACTGTGTTTCTGTATTTGTCTGAAAATTAGG 

1 ' 1 1 1 1 11 1 I I I I I ' ■ ■ ■ ' ■ ■ ■ ' I ■ ■ ■ ' i ■ ■ ■ ■ I I 1200 

C AGAAC AGACGACGTCGCGACGTCGT AGCAAGACACAAC AGAGAC AGACTGAC AC AAAGAC ATAAAC AGACTTTTAATCC 



Extended Packaging Region 



FIG. 16 (cont) 

GCC AGACTGTTACCACTCCCTT AAGTTTGACCTTAGGTCACTGGAAAGATGTCGAGCGGATCGCTCACAACCAGTCGGT A 

■ ■ ■ ■ i ■ ■ ■ ■ I I I I I ■ ■ ■ ■ I ■ ■ ■ ■ I ■ ■ ■ ■ I 1280 

CGGTCTGAC AATGGTGAGGGAATTC AAACTGGAATCC AGTGACCTTTCTAC AGCTCGCCTAGCGAGTGTTGGTCAGCCAT 



Extended Packaging Region 

PstI

GATGTCAAGAAGAGACGTTGGGTTACCTTCTGCTCTGCAGAATGGCCAACCTTTAACGTCGGATGGCCGCGAGACGGCAC 

I " — I ■ ■ ■ ■ I ■ — I ■ ■ I I I 1 1360 

CTACAGTTCTTCTCTGCAACCC AATGGAAGACGAGACGTC TTACCGGTTGGAAATTGCAGCCT ACCGGCGCTCTGCCGTG 



Extended Packaging Region



CTTTAACCGAGACCTCATCACCCAGGTTAAGATCAAGGTCTTTTCACCTGGCCCGCATGGACACCCAGACCAGGTCCCCT 

■ ■ ■ ■ i ■ ■ ■ ■ I | | I I I ■ ■ ■ ■ i » ■ ■ ■ I I 1440 

GAAATTGGCTCTGGAGT AGTGGGTCC AATTCTAGTTCC AGAAAAGTGGACCGGGCGTACCTGTGGGTCTGGTCC AGGGGA 



Extended Packaging Region



ACATCGTGACCTGGGAAGCCTTGGCTTTTGACCCCCCTCCCTGGGTCAAGCCCTTTGTACACCCTAAGCCTCCGCCTCCT 

I 1 ■ ' ' 1 I 1 ■ ' ' ■ I I I 1520 

TGTAGCACTGGACCCTTCGGAACCGAAAACTGGGGGGAGGGACCCAGTTCGGGAAAC ATGTGGGATTCGGAGGCGGAGGA 



Extended Packaging Region



CTTCCTCCATCCGCCCCGTCTCTCCCCCTTGAACCTCCTCGTTCGACCCCGCCTCGATCCTCCCTTTATCCAGCCCTCAC 

I I I 1 ' ■ ' ■ i ■ ' ■ ■ I I I ■ i ■■■ ■ [ 1600 

GAAGGAGGTAGGCGGGGCAGAGAGGGGGAACTTGGAGGAGCAAGCTGGGGCGGAGCTAGGAGGGAAATAGGTCGGGAGTG 



Extended Packaging Region 

NarI EcoRI BclI

I I 

TCCTTCTCTAGGCGCCGGAATTCCGATCTGATCAAGAGACAGGATGAGGATCGTTTCGC ATGATTGAACAAGATGGATTG 

■ ' ■ ■ | ■ ... i ■ ... | | | | | | , . ■ | 1680 

AGGAAGAGATCCGCGGCCTT AAGGCT AGACTAGTTCTC TGTCCT ACTCCTAGC AAAGCGTACT AACTTGTTCTACCTAAC 

Met Ile Glu Gln Asp Gly Leu

Extended Packaging Region    NEO

CACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGC 

■ 1 I ■ ■ ■ ■ i ■ ■ ■ ■ I 1 1 1 I ■ ■ 1 ■ ' ■ 1 ■ ■ 1 1760 

GTGCGTCCAAGAGGCCGGCGAACCCACCTCTCCGATAAGCCGATACTGACCCGTGTTGTCTGTTAGCCGACGAGACTACG 

His Ala Gly Ser Pro Ala Ala Trp Val Glu Arg Leu Phe Gly Tyr Asp Trp Ala Gln Gln Thr Ile Gly Cys Ser Asp Ala
NEO 



NarI

CGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGC 

■ ■ ■ ■ I ■ ■ ■ ■ i ■ ■ ■ ■ I I I I ■ ■ ■ ■ I I I 1840 

GCGGCAC AAGGCCGAC AGTCGCGTCCCCGCGGGCCAAGAAAAACAGTTCTGGCTGGACAGGCC ACGGGACTTACTTGACG 

Ala Val Phe Arg Leu Ser Ala Gln Gly Arg Pro Val Leu Phe Val Lys Thr Asp Leu Ser Gly Ala Leu Asn Glu Leu
NEO 



Tuesday, July 02, 2002 2:11 PM FIG. 16 (cont) Page 4
pLNHiv-YFP Map.MPD (1 > 7121) Site and Sequence 

PstI FspI PvuII

AGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGTCACTGAAGCG
1920

TCCTGCTCCGTCGCGCCGATAGCACCGACCGGTGCTGCCCGCAAGGAACGCGTCGACACGAGCTGCAACAGTGACTTCGC 

Gln Asp Glu Ala Ala Arg Leu Ser Trp Leu Ala Thr Thr Gly Val Pro Cys Ala Ala Val Leu Asp Val Val Thr Glu Ala
NEO 



GGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATC

2000

CCTTCCCTGACCGACGATAACCCGCTTCACGGCCCCGTCCTAGAGGACAGTAGAGTGGAACGAGGACGGCTCTTTCATAG

Gly Arg Asp Trp Leu Leu Leu Gly Glu Val Pro Gly Gln Asp Leu Leu Ser Ser His Leu Ala Pro Ala Glu Lys Val Ser
NEO 

CATCATGGCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAACATCGCA 

2080

GTAGTACCGACTACGTTACGCCGCCGACGTATGCGAACTAGGCCGATGGACGGGTAAGCTGGTGGTTCGCTTTGTAGCGT

Ile Met Ala Asp Ala Met Arg Arg Leu His Thr Leu Asp Pro Ala Thr Cys Pro Phe Asp His Gln Ala Lys His Arg
NEO 

TCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCA

2160

AGCTCGCTCGTGCATGAGCCTACCTTCGGCCAGAACAGCTAGTCCTACTAGACCTGCTTCTCGTAGTCCCCGAGCGCGGT

Ile Glu Arg Ala Arg Thr Arg Met Glu Ala Gly Leu Val Asp Gln Asp Asp Leu Asp Glu Glu His Gln Gly Leu Ala Pro
NEO 



SphI NcoI

GCCGAACTGTTCGCCAGGCTCAAGGCGCGCATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCC 

2240

CGGCTTGACAAGCGGTCCGAGTTCCGCGCGTACGGGCTGCCGCTCCTAGAGCAGCACTGGGTACCGCTACGGACGAACGG

Ala Glu Leu Phe Ala Arg Leu Lys Ala Arg Met Pro Asp Gly Glu Asp Leu Val Val Thr His Gly Asp Ala Cys Leu Pro 
NEO 



NaeI

GAATATCATGGTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCGCTATCAGGACA

2320

CTTATAGTACCACCTTTTACCGGCGAAAAGACCTAAGTAGCTGACACCGGCCGACCCACACCGCCTGGCGATAGTCCTGT

Asn Ile Met Val Glu Asn Gly Arg Phe Ser Gly Phe Ile Asp Cys Gly Arg Leu Gly Val Ala Asp Arg Tyr Gln Asp
NEO 



TAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTACGGTATCGCC 

2400

ATCGCAACCGATGGGCACTATAACGACTTCTCGAACCGCCGCTTACCCGACTGGCGAAGGAGCACGAAATGCCATAGCGG

Ile Ala Leu Ala Thr Arg Asp Ile Ala Glu Glu Leu Gly Gly Glu Trp Ala Asp Arg Phe Leu Val Leu Tyr Gly Ile Ala
NEO 



GCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACC

2480

CGAGGGCTAAGCGTCGCGTAGCGGAAGATAGCGGAAGAACTGCTCAAGAAGACTCGCCCTGAGACCCCAAGCTTTACTGG



Ala Pro Asp Ser Gln Arg Ile Ala Phe Tyr Arg Leu Leu Asp Glu Phe Phe •
NEO 




GACCAAGCGACGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGT 

2560

CTGGTTCGCTGCGGGTTGGACGGTAGTGCTCTAAAGCTAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCA 



NaeI SmaI
TTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCGGGCTCGATCCCC 
2640

AAAGGCCCTGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGGCCCGAGCTAGGGG



NruI PvuII

TCGCGAGTTGGTTCAGCTGCTGCCTGAGGCTGGACGACCTCGCGGAGTTCTACCGGCAGTGCAAATCCGTCGGCATCCAG

2720

AGCGCTCAACCAAGTCGACGACGGACTCCGACCTGCTGGAGCGCCTCAAGATGGCCGTCACGTTTAGGCAGCCGTAGGTC



PstI

GAAACCAGCAGCGGCTATCCGCGCATCCATGCCCCCGAACTGCAGGAGTGGGGAGGCACGATGGCCGCTTTGGTCGAGGC

2800

CTTTGGTCGTCGCCGATAGGCGCGTAGGTACGGGGGCTTGACGTCCTCACCCCTCCGTGCTACCGGCGAAACCAGCTCCG 



BamHI 

GGATCCTGGAAGGGCTAATTTGGTCCCAAAGAAGACAAGAGATCCTTGATCTGTGGATCTACCACACACAAGGCTACTTC

2880

CCTAGGACCTTCCCGATTAAACCAGGGTTTCTTCTGTTCTCTAGGAACTAGACACCTAGATGGTGTGTGTTCCGATGAAG



HIV-1 Promoter 



EcoRV 

CCTGATTGGCAGAATTACACACCAGGGCCAGGGATCAGATATCCACTGACCTTTGGATGGTGCTTCAAGCTAGTACCAGT 

2960

GGACTAACCGTCTTAATGTGTGGTCCCGGTCCCTAGTCTATAGGTGACTGGAAACCTACCACGAAGTTCGATCATGGTCA



HIV-1 Promoter 



TGAGCCAGAGAAGGTAGAAGAGGCCAATGAAGGAGAGAACAACAGCTTGTTACACCCTATGAGCCTGCATGGGATGGAGG

3040

ACTCGGTCTCTTCCATCTTCTCCGGTTACTTCCTCTCTTGTTGTCGAACAATGTGGGATACTCGGACGTACCCTACCTCC 



HIV-1 Promoter 



ACGCGGAGAAAGAAGTGTTAGTGTGGAGGTTTGACAGCAAACTAGCATTTCATCACATGGCCCGAGAGCTGCATCCGGAG

3120

TGCGCCTCTTTCTTCACAATCACACCTCCAAACTGTCGTTTGATCGTAAAGTAGTGTACCGGGCTCTCGACGTAGGCCTC



HIV-1 Promoter 



TACTACAAAGACTGCTGACATCGAGCTTTCTACAAGGGACTTTCCGCTGGGGACTTTCCAGGGAGGCGTGGCCTGGGCGG 

3200

ATGATGTTTCTGACGACTGTAGCTCGAAAGATGTTCCCTGAAAGGCGACCCCTGAAAGGTCCCTCCGCACCGGACCCGCC



PvuII XhoI HindIII

GACTGGGGAGTGGCGTCCCTCAGATGCTGCATATAAGCAGCTGCTTTTTGCCTGTACTGGGCCTCGAGAAGCTTGTTATC

3280

CTGACCCCTCACCGCAGGGAGTCTACGACGTATATTCGTCGACGAAAAACGGACATGACCCGGAGCTCTTCGAACAATAG



HIV-1 Promoter



NcoI

ACAAGTTTGTACAAAAAAGCAGGCTTCGAAGGAGATAGAACCAATTCTCTAAGGAAATACTTAACCATGGTGAGCAAGGG

3360

TGTTCAAACATGTTTTTTCGTCCGAAGCTTCCTCTATCTTGGTTAAGAGATTCCTTTATGAATTGGTACCACTCGTTCCC

Met Val Ser Lys Gly
att B1 | YFP



CGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCG
3440

GCTCCTCGACAAGTGGCCCCACCACGGGTAGGACCAGCTCGACCTGCCGCTGCATTTGCCGGTGTTCAAGTCGCACAGGC

Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser
YFP 



GCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGG 

3520

CGCTCCCGCTCCCGCTACGGTGGATGCCGTTCGACTGGGACTTCAAGTAGACGTGGTGGCCGTTCGACGGGCACGGGACC

Gly Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp
YFP 



PstI

CCCACCCTCGTGACCACCTTCGGCTACGGCCTGCAGTGCTTCGCCCGCTACCCCGACCACATGAAGCAGCACGACTTCTT

3600

GGGTGGGAGCACTGGTGGAAGCCGATGCCGGACGTCACGAAGCGGGCGATGGGGCTGGTGTACTTCGTCGTGCTGAAGAA

Pro Thr Leu Val Thr Thr Phe Gly Tyr Gly Leu Gln Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln His Asp Phe Phe
YFP 



CAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCG

3680

GTTCAGGCGGTACGGGCTTCCGATGCAGGTCCTCGCGTGGTAGAAGAAGTTCCTGCTGCCGTTGATGTTCTGGGCGCGGC

Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala
YFP 



AGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTG
3760

TCCACTTCAAGCTCCCGCTGTGGGACCACTTGGCGTAGCTCGACTTCCCGTAGCTGAAGTTCCTCCTGCCGTTGTAGGAC

Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu
YFP 



GGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAA

3840

CCCGTGTTCGACCTCATGTTGATGTTGTCGGTGTTGCAGATATAGTACCGGCTGTTCGTCTTCTTGCCGTAGTTCCACTT 

Gly His Lys Leu Glu Tyr Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly Ile Lys Val Asn
YFP 




CTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACG 

3920

GAAGTTCTAGGCGGTGTTGTAGCTCCTGCCGTCGCACGTCGAGCGGCTGGTGATGGTCGTCTTGTGGGGGTAGCCGCTGC 

Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp
YFP 



GCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCTACCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCAC 

4000

CGGGGCACGACGACGGGCTGTTGGTGATGGACTCGATGGTCAGGCGGGACTCGTTTCTGGGGTTGCTCTTCGCGCTAGTG

Gly Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Tyr Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His
YFP 



NotI XhoI

ATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAAAGCGGCCGCACTCG
4080

TACCAGGACGACCTCAAGCACTGGCGGCGGCCCTAGTGAGAGCCGTACCTGCTCGACATGTTCATTTCGCCGGCGTGAGC

Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys •
YFP



EcoRV XbaI ClaI

AGATATCTAGACCCAGCTTTCTTGTACAAAGTGGTGATAACATCGATAAAATAAAAGATTTTATTTAGTCTCCAGAAAAA

4160

TCTATAGATCTGGGTCGAAAGAACATGTTTCACCACTATTGTAGCTATTTTATTTTCTAAAATAAATCAGAGGTCTTTTT 



att B2 



NheI

GGGGGGAATGAAAGACCCCACCTGTAGGTTTGGCAAGCTAGCTTAAGTAACGCCATTTTGCAAGGCATGGAAAAATACAT
4240

CCCCCCTTACTTTCTGGGGTGGACATCCAAACCGTTCGATCGAATTCATTGCGGTAAAACGTTCCGTACCTTTTTATGTA



3' LTR 



PvuII EcoRV

AACTGAGAATAGAGAAGTTCAGATCAAGGTCAGGAACAGATGGAACAGCTGAATATGGGCCAAACAGGATATCTGTGGTA

4320

TTGACTCTTATCTCTTCAAGTCTAGTTCCAGTCCTTGTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCAT 



3' LTR



PvuII EcoRV


AGCAGTTCCTGCCCCGGCTCAGGGCCAAGAACAGATGGAACAGCTGAATATGGGCCAAACAGGATATCTGTGGTAAGCAG 
4400

TCGTCAAGGACGGGGCCGAGTCCCGGTTCTTGTCTACCTTGTCGACTTATACCCGGTTTGTCCTATAGACACCATTCGTC



3' LTR



XbaI



TTCCTGCCCCGGCTCAGGGCCAAGAACAGATGGTCCCCAGATGCGGTCCAGCCCTCAGCAGTTTCTAGAGAACCATCAGA

4480

AAGGACGGGGCCGAGTCCCGGTTCTTGTCTACCAGGGGTCTACGCCAGGTCGGGAGTCGTCAAAGATCTCTTGGTAGTCT



3' LTR 



TGTTTCCAGGGTGCCCCAAGGACCTGAAATGACCCTGTGCCTTATTTGAACTAACCAATCAGTTCGCTTCTCGCTTCTGT
4560

ACAAAGGTCCCACGGGGTTCCTGGACTTTACTGGGACACGGAATAAACTTGATTGGTTAGTCAAGCGAAGAGCGAAGACA



3' LTR 



SacI NarI



TCGCGCGCTTCTGCTCCCCGAGCTCAATAAAAGAGCCCACAACCCCTCACTCGGGGCGCCAGTCCTCCGATTGACTGAGT
4640

AGCGCGCGAAGACGAGGGGCTCGAGTTATTTTCTCGGGTGTTGGGGAGTGAGCCCCGCGGTCAGGAGGCTAACTGACTCA



3' LTR 



SmaI KpnI

CGCCCGGGTACCCGTGTATCCAATAAACCCTCTTGCAGTTGCATCCGACTTGTGGTCTCGCTGTTCCTTGGGAGGGTCTC 

4720

GCGGGCCCATGGGCACATAGGTTATTTGGGAGAACGTCAACGTAGGCTGAACACCAGAGCGACAAGGAACCCTCCCAGAG



3' LTR 



CTCTGAGTGATTGACTACCCGTCAGCGGGGGTCTTTCATTTGGGGGCTCGTCCGGGATCGGGAGACCCCTGCCCAGGGAC 
4800

GAGACTCACTAACTGATGGGCAGTCGCCCCCAGAAAGTAAACCCCCGAGCAGGCCCTAGCCCTCTGGGGACGGGTCCCTG



3' LTR



CACCGACCCACCACCGGGAGGTAAGCTGGCTGCCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCT

4880

GTGGCTGGGTGGTGGCCCTCCATTCGACCGACGGAGCGCGCAAAGCCACTACTGCCACTTTTGGAGACTGTGTACGTCGA

CCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCG 

4960

GGGCCTCTGCCAGTGTCGAACAGACATTCGCCTACGGCCCTCGTCTGTTCGGGCAGTCCCGCGCAGTCGCCCACAACCGC 



GGTGTCGGGGCGCAGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTTAACTATGCGGCATCAGAGCAGA 

5040

CCACAGCCCCGCGTCGGTACTGGGTCAGTGCATCGCTATCGCCTCACATATGACCGAATTGATACGCCGTAGTCTCGTCT 



NdeI



TTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCC

5120

AACATGACTCTCACGTGGTATACGCCACACTTTATGGCGTGTCTACGCATTCCTCTTTTATGGCGTAGTCCGCGAGAAGG

GCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACG

5200

CGAAGGAGCGAGTGACTGAGCGACGCGAGCCAGCAAGCCGACGCCGCTCGCCATAGTCGAGTGAGTTTCCGCCATTATGC

GTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGG
5280

CAATAGGTGTCTTAGTCCCCTATTGCGTCCTTTCTTGTACACTCGTTTTCCGGTCGTTTTCCGGTCCTTGGCATTTTTCC 

CCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGA
5360

GGCGCAACGACCGCAAAAAGGTATCCGAGGCGGGGGGACTGCTCGTAGTGTTTTTAGCTGCGAGTTCAGTCTCCACCGCT

AACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCT

5440

TTGGGCTGTCCTGATATTTCTATGGTCCGCAAAGGGGGACCTTCGAGGGAGCACGCGAGAGGACAAGGCTGGGACGGCGA 

TACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGG 

5520

ATGGCCTATGGACAGGCGGAAAGAGGGAAGCCCTTCGCACCGCGAAAGAGTATCGAGTGCGACATCCATAGAGTCAAGCC

TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTAT 

5600

ACATCCAGCAAGCGAGGTTCGACCCGACACACGTGCTTGGGGGGCAAGTCGGGCTGGCGACGCGGAATAGGCCATTGATA

CGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTA

5680

GCAGAACTCAGGTTGGGCCATTCTGTGCTGAATAGCGGTGACCGTCGTCGGTGACCATTGTCCTAATCGTCTCGCTCCAT

TGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTC

5760

ACATCCGCCACGATGTCTCAAGAACTTCACCACCGGATTGATGCCGATGTGATCTTCCTGTCATAAACCATAGACGCGAG

TGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTT

5840

ACGACTTCGGTCAATGGAAGCCTTTTTCTCAACCATCGAGAACTAGGCCGTTTGTTTGGTGGCGACCATCGCCACCAAAA

TTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGC

5920

AAACAAACGTTCGTCGTCTAATGCGCGTCTTTTTTTCCTAGAGTTCTTCTAGGAAACTAGAAAAGATGCCCCAGACTGCG

BspHI DraI

TCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATT

6000

AGTCACCTTGCTTTTGAGTGCAATTCCCTAAAACCAGTACTCTAATAGTTTTTCCTAGAAGTGGATCTAGGAAAATTTAA

DraI

AAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCA

6080

TTTTTACTTCAAAATTTAGTTAGATTTCATATATACTCATTTGAACCAGACTGTCAATGGTTACGAATTAGTCACTCCGT

• Trp His Lys Ile Leu Ser Ala
AMP



CCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGG

6160

GGATAGAGTCGCTAGACAGATAAAGCAAGTAGGTATCAACGGACTGAGGGGCAGCACATCTATTGATGCTATGCCCTCCC

Gly Ile Glu Ala Ile Gln Arg Asn Arg Glu Asp Met Thr Ala Gln Ser Gly Thr Thr Tyr Ile Val Val Ile Arg Ser Pro
AMP 



CTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGC 
6240

GAATGGTAGACCGGGGTCACGACGTTACTATGGCGCTCTGGGTGCGAGTGGCCGAGGTCTAAATAGTCGTTATTTGGTCG

Lys Gly Asp Pro Gly Leu Ala Ala Ile Ile Gly Arg Ser Gly Arg Glu Gly Ala Gly Ser Lys Asp Ala Ile Phe Trp Gly
AMP 



CAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCT

6320

GTCGGCCTTCCCGGCTCGCGTCTTCACCAGGACGTTGAAATAGGCGGAGGTAGGTCAGATAATTAACAACGGCCCTTCGA

Ala Pro Leu Ala Ser Arg Leu Leu Pro Gly Ala Val Lys Asp Ala Glu Met Trp Asp Ile Leu Gln Gln Arg Ser Ala
AMP 



FspI PstI

AGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTGCAGGCATCGTGGTGTCACGCTCGTCGTT
6400

TCTCATTCATCAAGCGGTCAATTATCAAACGCGTTGCAACAACGGTAACGACGTCCGTAGCACCACAGTGCGAGCAGCAA 

Leu Thr Leu Leu Glu Gly Thr Leu Leu Lys Arg Leu Thr Thr Ala Met Ala Ala Pro Met Thr Thr Asp Arg Glu Asp Asn 
AMP 



TGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTA

6480

ACCATACCGAAGTAAGTCGAGGCCAAGGGTTGCTAGTTCCGCTCAATGTACTAGGGGGTACAACACGTTTTTTCGCCAAT 

Pro Ile Ala Glu Asn Leu Glu Pro Glu Trp Arg Asp Leu Arg Thr Val His Asp Gly Met Asn His Leu Phe Ala Thr Leu
AMP 

PvuI

GCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAAT 

6560

CGAGGAAGCCAGGAGGCTAGCAACAGTCTTCATTCAACCGGCGTCACAATAGTGAGTACCAATACCGTCGTGACGTATTA

Glu Lys Pro Gly Gly Ile Thr Thr Leu Leu Leu Asn Ala Ala Thr Asn Asp Ser Met Thr Ile Ala Ala Ser Cys Leu
AMP 

TCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTAT

6640

AGAGAATGACAGTACGGTAGGCATTCTACGAAAAGACACTGACCACTCATGAGTTGGTTCAGTAAGACTCTTATCACATA

Glu Arg Val Thr Met Gly Asp Thr Leu His Lys Glu Thr Val Pro Ser Tyr Glu Val Leu Asp Asn Gln Ser Tyr His Ile
AMP 

DraI

GCGGCGACCGAGTTGCTCTTGCCCGGCGTCAACACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCA
6720

CGCCGCTGGCTCAACGAGAACGGGCCGCAGTTGTGCCCTATTATGGCGCGGTGTATCGTCTTGAAATTTTCACGAGTAGT
Arg Arg Gly Leu Gln Glu Gln Gly Ala Asp Val Arg Ser Leu Val Ala Gly Cys Leu Leu Val Lys Phe Thr Ser Met Met

AMP

TTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCA

6800

AACCTTTTGCAAGAAGCCCCGCTTTTGAGAGTTCCTAGAATGGCGACAACTCTAGGTCAAGCTACATTGGGTGAGCACGT

Pro Phe Arg Glu Glu Pro Arg Phe Ser Glu Leu Ile Lys Gly Ser Asn Leu Asp Leu Glu Ile Tyr Gly Val Arg Ala
AMP 



CCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAA
6880

GGGTTGACTAGAAGTCGTAGAAAATGAAAGTGGTCGCAAAGACCCACTCGTTTTTGTCCTTCCGTTTTACGGCGTTTTTT

Gly Leu Gln Asp Glu Ala Asp Lys Val Lys Val Leu Thr Glu Pro His Ala Phe Val Pro Leu Cys Phe Ala Ala Phe Phe
AMP 



GGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATT

6960

CCCTTATTCCCGCTGTGCCTTTACAACTTATGAGTATGAGAAGGAAAAAGTTATAATAACTTCGTAAATAGTCCCAATAA 



Pro Ile Leu Ala Val Arg Phe His Gln Ile Ser Met
AMP



BspHI 

GTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTG

7040

CAGAGTACTCGCCTATGTATAAACTTACATAAATCTTTTTATTTGTTTATCCCCAAGGCGCGTGTAAAGGGGCTTTTCAC 



BspHI 

CCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCA
7120

GGTGGACTGCAGATTCTTTGGTAATAATAGTACTGTAATTGGATATTTTTATCCGCATAGTGCTCCGGGAAAGCAGAAGT 



A

7121

T